An Analysis of Topic Modelling for Legislative Texts

James O' Neill (Insight Centre for Data Analytics, IDA Business Park, Galway, Ireland) james.oneill@insight-centre.org
Cécile Robin (Insight Centre for Data Analytics, IDA Business Park, Galway, Ireland) cecile.robin@insight-centre.org
Leona O' Brien (Governance, Risk and Compliance Technology Centre, University College Cork, Cork, Ireland) leona.obrien@ucc.ie
Paul Buitelaar (Insight Centre for Data Analytics, IDA Business Park, Galway, Ireland) paul.buitelaar@insight-centre.org

In: Proceedings of the Second Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL 2017), June 16, 2017, London, UK. Copyright © 2017 held by the authors. Copying permitted for private and academic purposes. Published at http://ceur-ws.org

ABSTRACT
The volume of legislative documents produced within the past decade has risen dramatically, making it difficult for law practitioners to attend to legislation such as Statutory Instrument orders and Acts. This work focuses on the use of topic models for summarizing and visualizing British legislation, with a view toward easier browsing and identification of salient legal topics and their respective sets of topic-specific terms. We provide an initial qualitative evaluation from a legal expert of how the models have performed, ranking them for each jurisdiction according to topic coherency and relevance.

1 INTRODUCTION
The legal domain is experiencing a major shift towards automated tools that can perform tasks which are becoming increasingly difficult for legal practitioners to carry out, due to the rate of change in the legal domain. Regulatory change (RC) is a notable area that has gained attention in recent years due to the difficulties involved in compliance. In order to build automated solutions for compliance and verification, automated knowledge acquisition is imperative for related tasks. An initial step towards such a system requires an overview and summarization of the core topics within the domain, in order to identify salient terms within the topics that are potentially associated with compliance across various documents. Many approaches in legal systems require metadata from an XML schema to carry out analysis such as topic modelling. This paper analyzes the use of topic models to do this automatically from raw text. We start with a background to the models used for testing.

2 TOPIC MODELLING

2.1 Dimensionality Reduction Approaches
A basic approach to modelling topics is to view a corpus as a set of term frequencies (tf), where the weight for each term is also dependent on the inverse document frequency (idf): e.g. "and" occurs many times in a document, therefore its weight is low. Formally, the weight is f_{t,d} \cdot \log(N / n_t), where N represents the number of documents and n_t is the number of documents term t appears in. From a term-document matrix M, dimensionality reduction techniques are often used to reduce all terms to a set of concepts, which can be interpreted as approximations of "topics" in a given corpus. The matrix factorization techniques we discuss include Singular Value Decomposition (SVD) and Non-Negative Matrix Factorization (NMF).

2.1.1 Non-Negative Matrix Factorization. NMF is specifically for factorizing matrices with non-negative values, which is why it is particularly suitable for term-document matrices. Since M is represented as non-negative values, features are composed of additive computations, resulting in a parts-based representation (as opposed to subtracting values, which would not lead to a parts-based factored representation) [6]. The objective of NMF is to find an approximation of matrix M by factorizing it into W (r × k) and H (k × c) such that M ≈ WH, where k is of lower rank than M. The reconstruction error is minimized according to Equation 1 [11, 12].

\frac{1}{2}\|M - WH\|_F^2 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{m}\big(M_{ij} - (WH)_{ij}\big)^2    (1)

As described by Lee and Seung [11], the multiplicative update algorithm is used for updating both W and H. Both update rules are outlined in Equation 2. The formulation ensures that the minimization is constrained to W and H being positive and that the distance between them is positive.

H_{\alpha\mu} \leftarrow H_{\alpha\mu}\,\frac{(W^{T}M)_{\alpha\mu}}{(W^{T}WH)_{\alpha\mu}}, \qquad W_{i\alpha} \leftarrow W_{i\alpha}\,\frac{(MH^{T})_{i\alpha}}{(WHH^{T})_{i\alpha}}    (2)

In this work, instead of using gradient descent to minimize the sum of squared (Euclidean) distances (SSD) between M and WH, we use the Coordinate Descent solver. Lin [13] describes a process that builds upon the multiplicative update algorithm by applying Alternating Non-negative Least Squares (ANLS) using projected gradient descent, a parameter estimator with lower-bounded constraints. Although NMF is widely used for topic modelling [21], it is sometimes known to produce non-meaningful topics, particularly if the term-document matrix is relatively sparse. Therefore, identifying both rare and non-distinct terms for removal before factorization is an important step. Furthermore, NMF can be prone to local minima.

2.1.2 Singular Value Decomposition. SVD decomposes a matrix into three parts, as shown in Equation 3, in order to find a lower-rank approximation of the term-document matrix (the rank of a matrix is the number of linearly independent column vectors, which can be used to reconstruct all column vectors). Consider M to be a tf-idf matrix representation of the corpus, where U diagonalizes MM^T and u_i represents the corresponding eigenvector. Similarly, V^* diagonalizes M^T M and v_i represents the eigenvectors of M^T M. The diagonal values of Σ are the ordered singular values (the square roots of the eigenvalues).

M = U \Sigma V^*    (3)

SVD on a term-document matrix is also referred to as Latent Semantic Analysis (LSA), as the lower-rank matrix M is said to represent a latent semantic space. In information retrieval it is referred to as Latent Semantic Indexing (LSI), where SVD is used to index documents by representing documents (document-document) and terms (document-term, where terms are query terms) in a vector space whose elements correspond to the degree of association a term or document has with a given topic. The similarity between a query and a given set of documents can then be determined using a term-topic matrix [18]. This is particularly helpful for distinguishing polysemous and synonymous terms.

2.2 Latent Dirichlet Allocation
Latent Dirichlet Allocation (LDA) was first introduced by Blei et al. [2] and has since been a state-of-the-art (SoTA) topic model, shown to be more expressive than probabilistic LSA (pLSA) [3]. LDA builds a Bayesian generative model using Dirichlet priors for the topic mixtures (an assumed prior probability for each topic distribution; the Dirichlet acts as a prior over categorical distributions in this sense), in contrast to pLSA, which can be considered to use a uniform prior distribution for the topic mixtures. Extensions have since been made to improve and adapt this model in a continuous-space setting, in which continuous word embeddings are used. Categorical distributions are replaced with multivariate Gaussian distributions, meaning that Gaussian LDA is capable of handling out-of-vocabulary words in unseen text [8]. The probability of a word w is dependent on a topic k in z, which is dependent on the probability of a document θ_d drawn from a Dirichlet prior α. Likewise, a word w is also dependent on the probability ϕ that word w belongs to topic k.

The LDA generative process is described by Blei [3]. For each document, a parameter θ_d is chosen from a Dirichlet prior distribution; then, for each word in d, a topic category is chosen according to the Dirichlet, and a word w is generated given the topic z_w and β. The aforementioned Gaussian LDA represents words as continuous embedded vectors instead of discrete co-occurrence counts, replacing the categorical distributions for z_n and w_n with Gaussians.

The saliency of terms within a topic is considered by Chuang et al. [7] and formulated in Equation 4. A distinctive word w is a word that has a higher log-likelihood of being in a topic K compared to a random word. Hence, if a word w occurs in many topics it is non-informative, resulting in lower saliency. More informative, topic-specific and less general terms are desired by legal practitioners, hence we use this saliency measure in our analysis.

S(w) = P(w)\sum_{K} P(K|w)\,\log\frac{P(K|w)}{P(K)}    (4)

Sievert and Shirley [19] describe the relevance measure, shown in Equation 5, where ϕ_k(w) is the probability of w for topic k and p(w) is the probability of observing w in corpus D. In this work, λ can be chosen between 0 and 1. We set λ according to term relevance judgments made by a legal practitioner, prior to the final analysis of each topic model.

r(w, k \mid \lambda) = \lambda\,\log(\phi_k(w)) + (1 - \lambda)\,\log\frac{\phi_k(w)}{p(w)}    (5)

2.3 Saffron
Saffron is a software tool (see http://saffron.insight-centre.org/) that can construct a model-free topic hierarchy. It extracts terms related to the domain of expertise, establishes semantic relations between them, and constructs a taxonomy out of them. Saffron also deals with multi-word expressions, which can improve topic coherency, as phrases are often necessary for better readability and understanding.

Saffron builds the topic hierarchy of a corpus by first capturing the expertise domain through a model represented as a single-word list. The latter is extracted using feature selection during a term and linguistic pattern extraction phase. It uses constraints such as limiting candidates to contentful parts of speech, to single words (in order to target a more generic level), and to terms distributed across at least a quarter of the corpus (for specificity to the area of expertise). Topic coherency, a main issue for statistically driven models if Subject Matter Experts (SMEs) are to rely upon them, is tackled here by using semantic relatedness to filter the candidate words. It is interpreted as a domain coherency measure using Pointwise Mutual Information (PMI) (see [4] for more details). The domain model is then used as a base to measure the coherence of the topics within the domain in the next phase.

After extracting candidate terms following a standard multi-word term extraction technique (see [4]), the first step involves searching for words from the domain model in the immediate context of those candidates. This allows a term's coherence within the domain to be determined, again through PMI calculation, by using top-level terms to extract intermediate-level terms.

To create the pruned graph which represents the taxonomy, the strength of the relationship between two terms is measured, defined as I_{ij} = D_{ij} / (D_i \times D_j), where D_i is the number of documents that mention the term T_i in our corpus, D_j is the number of documents that mention the term T_j, and D_{ij} is the number of documents in which both terms appear. Edges are added to the graph for all pairs that appear together in at least three documents, a threshold fixed based on the results of previous studies and tests (see [4] for more details). Saffron also uses a generality measure to direct edges from generic concepts to more specific ones. This results in a dense, noisy directed graph that is further trimmed using a specific branching algorithm which was successfully applied for the construction of domain taxonomies in [14]. This yields a tree structure where the root is the most generic term and the leaves are the most specific terms.
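As a concrete illustration of the objective in Equation 1 and the multiplicative updates in Equation 2, the following sketch factorizes a toy random non-negative matrix with NumPy. The matrix size, rank and iteration count are invented for illustration and are not the settings used in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy non-negative "term-document" matrix M (20 terms x 12 documents).
M = rng.random((20, 12))
k = 4  # number of topics, i.e. the lower rank of the factorization

# Random non-negative initialisation of W (terms x topics) and H (topics x documents).
W = rng.random((20, k))
H = rng.random((k, 12))

def reconstruction_error(M, W, H):
    """The objective of Equation 1: 0.5 * ||M - WH||_F^2."""
    return 0.5 * np.linalg.norm(M - W @ H, "fro") ** 2

errors = [reconstruction_error(M, W, H)]
eps = 1e-10  # guards against division by zero
for _ in range(200):
    # Multiplicative update rules of Lee and Seung (Equation 2).
    H *= (W.T @ M) / (W.T @ W @ H + eps)
    W *= (M @ H.T) / (W @ H @ H.T + eps)
    errors.append(reconstruction_error(M, W, H))

# The factors stay non-negative and the objective decreases.
assert (W >= 0).all() and (H >= 0).all()
assert errors[-1] < errors[0]
```

scikit-learn's `NMF` class exposes the same objective, with `solver='cd'` selecting the Coordinate Descent variant used later in this work.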
3 RELATED WORK
Wiltshire et al. [20] introduced a large-scale machine learning system that incorporates hierarchical topic construction after the extraction of terms, legal phrases and case cites. Their system allows for a ranking and classification of topics given a legal concept as input, according to a scoring criterion. George et al. [10] provide a legal system for ranking documents according to their similarity to legal cases, by finding similarity between documents in the latent topic space and query terms. They then use human assistance to annotate documents that are relevant to the query in a semi-supervised fashion. In contrast, our work is fully unsupervised, with no human assistance during the topic modelling process. LDA has been used extensively on natural language texts such as social media texts [16], publication texts, newspapers etc., but typically not in formal settings such as legal texts.

Raghuveer and Kumar [17] use LDA to cluster Indian legal judgements, with cosine similarity as the distance measure between documents for clustering. However, their evaluation does not draw on the prior knowledge of a legal expert to determine whether the clusters coincide with legal knowledge within the domain.

O' Neill et al. [15] have identified salient legal statements (in contrast to salient topics) by extracting deontic modalities, using a small number of labeled samples to train a recurrent neural network.

Ahmed and Xing [1] use a dynamic HDP to track topics over time; documents can be exchanged while the ordering remains intact. They also use longitudinal NIPS papers to track emerging and decaying topics (this is worth noting, particularly for tracking changing topics around compliance issues).

The use of the aforementioned Saffron has been previously demonstrated in a wide range of projects from several domains and for different tasks. In [5], Bordea used Saffron's topic extractor to analyze legal documents arising around the financial crisis in 2008. She framed the problem as an expert finding task, which aims at ranking people that have knowledge about a given topic. In that particular context, the task allowed the identification of individuals involved in defining the response of the U.S. government to the financial crisis by searching for a topic of interest. In [4], Saffron was used as a tool to detect the presence of different disciplines within the field of Web Science. Running it on over 10 years of Web Science conference series documents resulted in the discovery of 4 communities (Communication, Computer Science, Psychology, and Sociology), and of trends over time and across types of paper. Saffron was also used in a demo for an Irish bookshop website (http://kennys.insight-centre.org/) to extract topics from book descriptions/reviews and then classify them accordingly, as well as to link the books for the creation of a multi-level browsing application for book navigation.

4 METHODOLOGY
This section outlines the steps towards creating each topic model and the configurations used for analysis. We start with a brief introduction to the corpora used and the preprocessing steps common to all topic models. United Kingdom legislative texts (retrieved from http://www.legislation.gov.uk/) were used for topic modelling. The corpus contains 41,518 documents between 2000 and 2016. However, for practical purposes the analysis is carried out on the year 2016 only, to lessen the reading burden on the legal practitioner. The legislative types consist of the following: 304 Northern Ireland Statutory Rules, 838 UK Statutory Instruments, 132 Welsh Statutory Instruments and 317 Scottish Statutory Instruments.

4.1 Text Preprocessing
Corpus-specific regular expressions (REs) are used to clean legal domain syntax (e.g. bracketed alphanumerics), followed by tokenization and lemmatization using the WordNet lemmatizer [9]. The structure usually contains nested expressions, e.g. (ii) followed by (a) and (b) subsections. This syntax is removed using the regular expressions, along with other standard REs for identifying references and alphanumeric expressions, e.g. "Regulation EC No. 1370/2007 means Regulation 1370/2007 ...". Redundant stopwords and words with frequency f < 2 are removed from the corpora. This is carried out under the supervision of a subject expert, by analysing a subsample of the terms considered for removal. We assume that terms with high frequency are not specific to a particular topic (e.g. 'the', 'of'), and that rare terms occurring infrequently are not representative of a single topic, since they do not appear often enough to infer that they are salient for a topic. Each corpus (one corpus per jurisdiction) is then converted to a term-document matrix, where each word is weighted using the aforementioned tf-idf weighting scheme. Furthermore, the top 30 terms for all models except Saffron are listed for the SME to rank. For Saffron, we rely on a visualization of the term hierarchy for a domain expert to judge.

4.2 Ranking Criterion and Model Configurations
In order for a legal practitioner to assess the models in a fair manner, a set of guidelines is presented for the ranking of the models. An important aspect of the ranking is the pre-tuning of the term relevance parameter λ, which selects the top 30 terms presented for each topic within each jurisdiction. We also assess a number of parameter settings for NMF, LSA, LDA and HDP before finally choosing the final set of 10 topics on which the legal expert makes their final judgment. Since the term-document matrix is quite sparse (evident from Figure 1), NMF is initialized using Non-Negative Singular Value Decomposition (NNSVD). The Coordinate Descent solver is used for minimizing the reconstruction error, as mentioned in Section 2.1.1. The number of components is set to n_k = 10. LSI uses standard SVD, which does not require much tuning other than choosing the number of singular values, also set to n_k = 10. For LDA we choose a low relevance λ = 0.25 to highlight topic-specific terms.

5 RESULTS
In this section we analyse the topics retrieved by each approach; an SME evaluated the topics for the regulations. Figure 1 compares the effects of dictionary size as infrequent terms are increasingly removed. It is evident that after removing terms that occur less than twice, the corpus' size decreases dramatically, meaning that a significant number of terms are too specific to a particular document. We remove these terms for subsequent analysis.

[Figure 1: Rare-word Removal For Each Corpus]
[Figure 2: LDA topics for Northern Ireland Statutory Rules projected to 2 principal components using multi-dimensional scaling (MDS)]
[Figure 3: Latent Dirichlet Allocation terms for topic 10 of Northern Ireland Statutory Rules]
[Figure 4: Support Allowance topic within Northern Ireland Statutory Rules]

Latent Dirichlet Allocation Visualization. For the visualization of LDA topics, we use the pyLDAvis [19] visualization tool.
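The effect of the relevance setting λ = 0.25 can be illustrated with a small sketch of Equation 5. The word probabilities below are invented for illustration; with λ = 1 the ranking reduces to the raw topic probability, while a low λ up-weights the lift ϕ_k(w)/p(w) and favours topic-specific terms:

```python
import math

def relevance(phi_kw: float, p_w: float, lam: float) -> float:
    """Equation 5: r(w, k | lambda) = lam*log(phi_k(w)) + (1-lam)*log(phi_k(w)/p(w))."""
    return lam * math.log(phi_kw) + (1 - lam) * math.log(phi_kw / p_w)

# Toy figures (invented): a corpus-wide frequent word vs. a topic-specific one.
# phi_k(w): probability of w under topic k; p(w): marginal probability in the corpus.
generic = {"phi": 0.020, "p": 0.019}   # e.g. "regulation": common everywhere
specific = {"phi": 0.015, "p": 0.001}  # e.g. "bioliquid": concentrated in one topic

for lam in (1.0, 0.25):
    r_gen = relevance(generic["phi"], generic["p"], lam)
    r_spec = relevance(specific["phi"], specific["p"], lam)
    print(f"lambda={lam}: generic={r_gen:.2f}, specific={r_spec:.2f}")
```

This mirrors the λ slider in pyLDAvis, where the top terms displayed for each topic are re-ranked by r(w, k | λ).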
A multidimensional scaling projects the t-dimensional topic space to 2 dimensions, as shown in Figure 2. Ten topics for the Northern Ireland Statutory Rules (NISR) are presented with the relevance metric set to λ = 0.25 (which decides the term-topic specificity). This is done under the supervision of a legal practitioner, to ensure that λ is tuned to a correct specificity and that topics are coherent, before a final evaluation. Some terms, such as biomass, biomaterial, bioliquid, fossil and fuel, show a clear and distinct topic and are quite topic-specific given λ = 0.25; this is shown by the red bars, which indicate the term frequency within the given topic, as opposed to the blue bars, which indicate the term frequency across the whole corpus.

Saffron. In Saffron's results, a cluster is located around the extracted topic of department of justice, and around support allowance, which derives the whole taxonomy for the Northern Ireland Statutory Rules. The latter topic is thus the primary node of the 2016 corpus. In Figure 4, we zoom in on a subset of this graph (and thus of the sub-domains) which includes housing benefit, income support, social security and personal_independence payment. These are all semantically related to the mother node support allowance, but tackle different aspects of it. We can see the advantage of the hierarchical structure of the graph, with semantically related topics going from the more generic to the more specialized. In this way we can identify a waterfall structure from the housing benefit branch, logically followed by the more specific local housing allowance, and then local housing allowance determination. Another quite clear example can be observed in the child support branch, related to the personal independence_payment node. From child support, the directed edge links to child support maintenance, then maintenance calculation, and finally the three topics child_support_maintenance_calculation_regulation, welfare service and maintenance assessment. The police service node is at the root of a taxonomy that includes the child nodes northern_ireland_reserve ⇒ notice_of_appeal ⇒ written_representation, avoiding service ⇒ reasonable_amount_of_duty_time.
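The taxonomy chains above follow from the edge construction described in Section 2.3. Below is a minimal sketch of the strength measure I_ij = D_ij/(D_i × D_j) and the three-document co-occurrence threshold; the document sets are invented for illustration, and plain document frequency stands in for Saffron's generality measure:

```python
from itertools import combinations

# Toy document-occurrence sets (invented): ids of documents each term appears in.
docs = {
    "child support":             {1, 2, 3, 4, 5, 6, 7, 8},
    "child support maintenance": {1, 2, 3, 4},
    "maintenance calculation":   {2, 3, 4},
    "welfare service":           {9, 10},
}

edges = {}
for t_i, t_j in combinations(docs, 2):
    d_ij = len(docs[t_i] & docs[t_j])
    if d_ij >= 3:  # threshold from Section 2.3: co-occurrence in at least 3 documents
        # Strength of the relationship: I_ij = D_ij / (D_i * D_j)
        edges[(t_i, t_j)] = d_ij / (len(docs[t_i]) * len(docs[t_j]))

# A generality proxy directs each edge from the more generic term (here simply
# the one occurring in more documents) towards the more specific one.
directed = {(a, b) if len(docs[a]) >= len(docs[b]) else (b, a): w
            for (a, b), w in edges.items()}
print(directed)
```

On this toy data, edges run child support ⇒ child support maintenance ⇒ maintenance calculation, while welfare service stays disconnected because it never co-occurs with the others in three documents.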
This example summary allows a legal practitioner to identify topics surrounding certain legal issues, or simply to summarize a complete jurisdiction. Zooming in on a subset of the hierarchical tree, we highlight a topic with coherent terms that correspond to compliance-related issues: multi-word expressions summarizing an area within the Northern Ireland Statutory Rules, shown in Figure 5.

[Figure 5: Police Service topic within Northern Ireland Statutory Rules]

After evaluation, Saffron has been consistently ranked as the most favourable of all models, as the aforementioned vocabulary pruning and usage of multi-word expressions has played a fundamental role in topic coherency. Standard LDA has performed the best of all single-term models, particularly when top terms are chosen according to their topic specificity. HDP has inferred a similar number of topics to LDA, according to an analysis of the log-likelihood curve and the legal practitioner's judgment. This work is an early indication of how legal practitioners can identify salient and coherent topics using automatic topic modelling tools.

Ranking. Table 1 shows the results of the SME ranking after assessing each topic model for each jurisdiction. Saffron is favored overall for all jurisdictions, considering it is the only model that performs multi-word expression topic extraction and weighting of descriptive noun terms/phrases. We conjecture that the appeal of a hierarchical structure and multi-word noun expressions has influenced the interpretation of the salient terms in the domain, making it easier for legal practitioners to identify important and coherent legal topics. We emphasize at this point that single-word topic models and multi-word hierarchical models are not directly comparable, for the reasons outlined; however, they are included in Table 1 to highlight the importance of longer expressions linked in a taxonomy, providing more clarity on what the emerging topics are.

Table 1: Subject Matter Expert Ranking of Topic Models

Rank  NISR     SSI      UKSI      WSI
1     Saffron  Saffron  Saffron   Saffron
2     LDA      LDA      LDA       LDA
3     HDP      NMF      HLDP/LSI  HLDP/LSI
4     LSA      LSI      HLDP/LSI  HLDP/LSI
5     NMF      HLDP     NMF       NMF

6 CONCLUSION
This work has presented a fully automated approach for identifying topics in regulations that assist in easier tracking of important domain

REFERENCES
[1] Amr Ahmed and Eric P. Xing. 2012. Timeline: A Dynamic Hierarchical Dirichlet Process Model for Recovering Birth/Death and Evolution of Topics in Text Stream. CoRR abs/1203.3463 (2012). http://arxiv.org/abs/1203.3463
[2] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3 (2003), 993–1022.
[3] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. J. Mach. Learn. Res. 3 (March 2003), 993–1022. http://dl.acm.org/citation.cfm?id=944919.944937
[4] Georgeta Bordea. 2013. Domain Adaptive Extraction of Topical Hierarchies for Expertise Mining. Ph.D. Dissertation.
[5] Georgeta Bordea, Kartik Asooja, Paul Buitelaar, and Leona O'Brien. 2014. Gaining Insights into the Global Financial Crisis Using Saffron. NLP Unshared Task in PoliInformatics (2014).
[6] Deng Cai, Xiaofei He, Xiaoyun Wu, and Jiawei Han. 2008. Non-negative Matrix Factorization on Manifold. In Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on. IEEE, 63–72.
[7] Jason Chuang, Christopher D. Manning, and Jeffrey Heer. 2012. Termite: Visualization Techniques for Assessing Textual Topic Models. In Proceedings of the International Working Conference on Advanced Visual Interfaces. ACM, 74–77.
[8] Rajarshi Das, Manzil Zaheer, and Chris Dyer. 2015. Gaussian LDA for Topic Models with Word Embeddings. In ACL (1). 795–804.
[9] Christiane Fellbaum. 1998. WordNet. Wiley Online Library.
[10] Clint Pazhayidam George, Sahil Puri, Daisy Zhe Wang, Joseph N. Wilson, and William F. Hamilton. 2014. SMART Electronic Legal Discovery via Topic Modeling. In FLAIRS Conference.
[11] Daniel D. Lee and H. Sebastian Seung. 1999. Learning the Parts of Objects by Non-negative Matrix Factorization. Nature 401, 6755 (1999), 788–791.
[12] Daniel D. Lee and H. Sebastian Seung. 2001. Algorithms for Non-negative Matrix Factorization. In Advances in Neural Information Processing Systems. 556–562.
[13] Chih-Jen Lin. 2007. Projected Gradient Methods for Nonnegative Matrix Factorization. Neural Computation 19, 10 (2007), 2756–2779.
[14] Roberto Navigli, Paola Velardi, and Stefano Faralli. 2011. A Graph-based Algorithm for Inducing Lexical Taxonomies from Scratch. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Three (IJCAI'11). AAAI Press, 1872–1877. DOI: http://dx.doi.org/10.5591/978-1-57735-516-8/IJCAI11-313
[15] James O' Neill, Paul Buitelaar, Cecile Robin, and Leona O' Brien. 2017. Classifying Sentential Modality in Legal Language: A Use Case in Financial Regulations, Acts and Directives. In Proceedings of the 16th Edition of the International Conference on Artificial Intelligence and Law. ACM, 159–168.
[16] Marco Pennacchiotti and Siva Gurumurthy. 2011. Investigating Topic Models for Social Media User Recommendation. In Proceedings of the 20th International Conference Companion on World Wide Web. ACM, 101–102.
[17] K. Raghuveer. 2012. Legal Documents Clustering Using Latent Dirichlet Allocation. IAES Int. J. Artif. Intell. 2, 1 (2012), 34–37.
[18] Barbara Rosario. 2000. Latent Semantic Indexing: An Overview. Techn. rep. INFOSYS 240 (2000).
[19] Carson Sievert and Kenneth E. Shirley. 2014. LDAvis: A Method for Visualizing and Interpreting Topics. In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces. 63–70.
[20] James S. Wiltshire Jr., John T. Morelock, Timothy L. Humphrey, X. Allan Lu, James M. Peck, and Salahuddin Ahmed. 2002. System and Method for Classifying Legal Concepts Using Legal Topic Scheme. (Dec. 31 2002). US Patent 6,502,081.
[21] Xiaohui Yan, Jiafeng Guo, Shenghua Liu, Xueqi Cheng, and Yanfeng Wang. 2013. Learning Topics in Short Texts by Non-negative Matrix Factorization on Term Correlation Matrix. In Proceedings of the 2013 SIAM International Conference on Data Mining. SIAM, 749–757.