<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Semi-automated creation of regulation rule bases using generic template-driven rule extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Deepali Kholkar</string-name>
          <email>deepali.kholkar@tcs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sagar Sunkle</string-name>
          <email>sagar.sunkle@tcs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vinay Kulkarni</string-name>
          <email>vinay.vkulkarni@tcs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TCS Research</institution>
          ,
          <addr-line>54B</addr-line>
          ,
          <institution>Hadapsar Industrial Estate</institution>
          ,
          <addr-line>Pune, Maharashtra 411013</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>16</volume>
      <issue>2017</issue>
      <abstract>
        <p>Formal approaches to checking compliance manually encode individual obligations from the regulation text as rules. Automated extraction approaches identify key elements in regulatory text, and create annotated, in some cases structured, representations of regulation text. It is desirable to combine the two approaches to automate creation of a regulation rule base that can be used for inferencing and reasoning about compliance. In this paper we present a semi-automated approach that uses a generic semantic model of regulations to guide automated extraction of rule suggestions. The suggestions help domain experts author rules in Structured English using the generic model as template. Rules are translated automatically into a Semantics of Business Vocabulary and Rules (SBVR) model and defeasible logic rules, creating a hierarchical knowledge base that reflects the regulation structure and enables querying and reasoning about compliance.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>Enterprises need to comply with a plethora of regulations. The
process of compliance is made more complex by the fact that regulatory
bodies publish guidelines pertaining to a single legislation in
multiple forms and regulatory documents such as directives, regulations,
and annexures containing supporting information such as reporting
formats, data descriptions, and example cases. It is a great deal of
effort for domain experts to manually compile, correlate, and
interpret information from all of these sources and translate it into
implementation of compliance.</p>
      <p>
        Several approaches exist for automated legal information
extraction that identify patterns and classify information available in
unstructured natural language text, annotate it, and in some cases
convert it into structured representations such as XML [
        <xref ref-type="bibr" rid="ref10 ref21 ref28 ref29">10, 21, 28, 29</xref>
        ].
The resultant rules, however, are not in a logic form that can be
rigorously reasoned with. Formal compliance checking approaches, on the
other hand, use logic formalisms to represent regulation rules, but
these need to be encoded manually by human experts [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ].
      </p>
      <p>
        Formal approaches in research usually encode a small subset of
rules from regulation text and demonstrate compliance to individual
rules. Encoding rules manually from the entire natural language
text of a regulation is a complex endeavor due to the volume of
text, legal language, and abstract nature of guidelines described. A
bigger knowledge engineering problem is creating an isomorphic
rule base structured such that it is an accurate representation of the
regulation, necessary for it to be usable by its users, and also easier
to maintain [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Getting the rule hierarchy, predicates, and arguments
of each rule right, necessary for correct inferencing, presents the
greatest complexity in manual rule creation. All of these require the
person(s) writing the rules to be an expert in the regulation domain
as well as formal logic, which is hard in practice. Building a rule
base of an entire regulation thus becomes a daunting task. With
multiple regulatory document sources to be considered for compliance,
(semi-) automated information extraction becomes desirable [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ].
Even then, building and structuring the rule base from extracted
information remains a major challenge.
      </p>
      <p>
        We carried out case study experiments of building rule bases for
two large real-life regulations, viz. MiFID-2 (Markets in Financial
Instruments Directive) and KYC (Know Your Customer) regulations
[
        <xref ref-type="bibr" rid="ref18 ref26 ref27">18, 26, 27</xref>
        ]. Although we were helped by domain experts in
understanding the business domain of each regulation, it was a complex
task to encode formal rules from the natural language regulation
text, both with and without the help of (semi-) automated
extraction [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ]. Following our own approach of (semi-) automated
extraction [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ], it took several iterations before we got the rule
hierarchy and parameters of predicates right. Most importantly, it
was hard to pinpoint rules modularly in the regulation text. By
modularly, we mean that our rule extraction process [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], based on the
generation of a domain model and a dictionary [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], enabled us to
classify the legal text sentences into those that pertain to
regulatory rules and those that do not. Unaware of the greater structure
in which a regulatory body organizes the regulations, we ended up
overlooking some critical rules that relate to such organization. It is
in this context that the work presented in this paper becomes relevant.
Examples of these problems are provided in the case study section.
      </p>
      <p>
        This paper presents an approach to address the challenge of
building structured rule bases for large regulations guided by a generic
semantic model. We use our approach for (semi-) automated
extraction of a domain model, dictionary, and rule suggestions to get to
the rules [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ]. The generic model also serves as a hierarchical
template for creating the rule base, wherein the domain expert fills in
extracted information to create rules in a controlled natural language.
The template helps create a coherent knowledge base of rules with
an inference hierarchy that makes reasoning about higher-level goals
of the regulation possible.
      </p>
      <p>[Figure 1. Diagram labels: generic semantic model for regulations (generic regulation concepts, generic regulation rules); information extraction; extracted instances; extracted rule suggestions; rule template; semantic model of specific regulation; translation; regulation rule base.]</p>
      <p>
        Most importantly, the template gives a
skeletal structure that ensures inclusion of principal categories of
rules. The rules are translated automatically into a Semantics of
Business Vocabulary and Rules (SBVR) model<sup>1</sup> and further into a
defeasible logic formalism, DR-Prolog [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] as we detailed in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>Our overall approach is depicted in Figure 1. Section 2 first briefly
reviews our (semi-) automated extraction approach and then
describes the generic semantic model.
Section 3 describes rule base creation and querying for compliance,
Section 4 discusses the utility of our approach, Section 5 describes
related work, and Section 6 concludes the paper. We illustrate our
approach using a real-life case study from the MiFID-2 regulation
applicable in the European Union (EU).
</p>
    </sec>
    <sec id="sec-2">
      <title>2 OUR APPROACH FOR RULE EXTRACTION</title>
      <p>
        In this section, we first review the generation of the domain model
and dictionary. We also go over the creation of a classifier that
uses these artifacts to classify legal text sentences into those that
contribute to rules and those that do not. We then proceed to elaborate
on our approach for (semi-) automated creation of hierarchical rule
bases. Note that we only expound the key ideas without restating
the results already published in [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. We proceed by revisiting the
motivation behind the domain modeling first.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Domain Modeling for Regulatory Compliance</title>
      <p>In our engagements and interactions with domain experts from
enterprises active in banking and financial services, we found
that the domain experts would encode their knowledge in the form of
descriptive artifacts, within which they would establish some form
of traceability. In most cases, however, the backbone of this activity was
a mental model of the regulation, which the domain experts had to
somehow corroborate with the artifacts that the governance, risk,
and compliance (GRC) frameworks or in-house solutions
let them create. These solutions did not offer the domain
experts a way to formalize their knowledge.</p>
      <p>1 Semantics of Business Vocabulary and Business Rules, http://www.omg.org/spec/SBVR/1.2/</p>
      <p>
        A domain model and a dictionary of the concepts in the model
could be used as the central artifacts to drive the compliance process,
giving the domain experts a more principled way of managing
compliance. Such a domain model would also be helpful if one were to
introduce the benefits of formal compliance checking in an industry
setting [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. This observation led us to come up with a method and
a tool for generation of a domain model and a dictionary, detailed in
[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], revisited below.
      </p>
      <p>
        Using Distributional Semantics for Building Domain Model
and Dictionary. Instead of using natural language processing (NLP)
for syntactic analysis of legal text, we chose to use NLP to
implement distributional semantics in the process of building the domain
model and the dictionary. Most state-of-the-art NLP approaches
to creating domain models or ontologies rely on syntactic features of
the tokens in the text [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ]. These approaches tend to use
heuristics: for instance, every noun phrase is a candidate for a concept,
every verb phrase is a candidate for a relation, and every adjective
is a candidate for a characteristic of a concept. In our
experience, such approaches are feasible when (a) the sentences in the
given text are short<sup>2</sup>, (b) the sentences possess simple phrasal and
clausal structures that do not lead to multiple parses, and (c) the
overall number of sentences in the text under consideration is a few
hundred. For several hundreds of long and complex
sentences<sup>3</sup>, which is the usual case in business domains like banking
and financial regulations<sup>4,5</sup>, we needed to use techniques that did
not specifically depend on syntactic features for constructing the
domain models.
      </p>
      <p>
        We chose to use the distributional semantics hypothesis [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to help
the domain expert discover the domain model and the dictionary of
concepts. The distributional semantics hypothesis states that words
that occur in the same contexts tend to have similar meanings. Since
this hypothesis is independent of syntactic features, neither the length nor
the phrasal or clausal complexity of the sentences restricts
the scope or the scale of its application. The following
observations helped us design the implementation of the distributional
semantics hypothesis for domain modeling:
∙ All regulations constrain the interaction of domain concepts
in some manner [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. To do so, the text of the regulations
uses mentions of domain concepts. By getting a handle on
concepts and their mentions, it becomes intuitively easy to
understand what the regulation is trying to do and how to
specify it [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
∙ Fact-orientation (FO), a domain modeling method used
for constructing vocabularies in SBVR [
        <xref ref-type="bibr" rid="ref13 ref22">13, 22</xref>
        ], uses the
same principle as the distributional semantics hypothesis
in its conceptual schema design procedure (CSDP). The
very first step in CSDP is to transform familiar information
examples into elementary facts. When performed manually,
a modeler essentially strives to check whether the contexts
of familiar examples contain some hints to obtain concepts
and relations [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
      </p>
      <p>2 Examples from most of these approaches contain sentences with 5-15 tokens
(words). The Penn TreeBank, on which statistical parsers like the Stanford PCFG parser
and the Malt parser are trained, has sentences with an average length of 25.6 tokens [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>3 In our Know Your Customer (KYC) for Indian Banks case study, we found that the
average length of the sentences was 31.7 tokens. For the MiFID-2 text, it is 38.27 tokens.</p>
      <p>4 The number of sentences in the text offered online for KYC is 526, while in
MiFID-2 it is 4069. The sentences are obtained using a heuristics-based sentence detection
model and do not include additional text from relevant documents that an enterprise
may have to consider for enacting compliance to these regulations.</p>
      <p>5 The KYC and MiFID-2 links are provided at the end of this section.</p>
      <p>
        We refer to occurrences of the instances and the synonyms of
the domain concepts as mentions. Based on the above observations, we
compute spans of text of a configurable length around (both to
the left and to the right of) the mentions of the domain entities. We
cluster the contexts of each concept discovered so far, so as to find
its other mentions and the mentions of the concepts to which it is
likely related [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ].
      </p>
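      <p>The computation of context spans around mentions can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the cited work: the function name context_windows, the whitespace tokenization, the restriction to single-token mentions, and the width parameter are all hypothetical, and the subsequent clustering of contexts is not shown.</p>
      <preformat>
```python
def context_windows(tokens, mentions, width=3):
    """Collect the tokens to the left and right of every mention occurrence.

    tokens:   list of word tokens from one sentence
    mentions: set of lower-cased single-token mentions (an assumed simplification)
    width:    configurable number of tokens kept on each side
    """
    windows = []
    for i, tok in enumerate(tokens):
        if tok.lower() in mentions:
            left = tokens[max(0, i - width):i]      # up to `width` tokens before
            right = tokens[i + 1:i + 1 + width]     # up to `width` tokens after
            windows.append((tok, left, right))
    return windows


sentence = "This Directive shall apply to investment firms and market operators".split()
seed_mentions = {"directive", "firms"}
spans = context_windows(sentence, seed_mentions, width=2)
```
      </preformat>
      <p>With a width of 2, the sketch yields one window for "Directive" and one for "firms"; contexts collected this way across the whole text would then be the input to clustering.</p>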
      <p>The domain expert has the option to provide a seed set of domain
concepts and their mentions to the system, generally found in the
definitions section of most industry regulations<sup>6</sup>, or to build the
domain model from scratch, starting with a single concept and its
mention.</p>
      <sec id="sec-3-1">
        <title>Using Informed Active Learning for a Rule Classifier</title>
        <p>
          Our choice of active learning technique was motivated by the fact that the
active learning process aims at keeping the domain expert annotation
effort to a minimum, only asking for advice where the training utility
of the result of such a query is high [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
        </p>
        <p>
          For the purpose of classification of legal text sentences, it is
possible to use features based on various n-grams (n items like letters
or words), and part of speech classes like verbs, modal auxiliaries,
word couples and so on. Such features do provide acceptable results
for detecting arguments in legal text [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. Instead of such features,
we make use of the domain model and the dictionary obtained
previously. A dictionary-based feature is activated whenever a mention
of a domain concept is found in a given sentence. During the
active learning sessions, the role of the domain expert is essentially
to provide a judgment over classification suggested by the active
learner. The domain expert is queried for the top-k sentences one
by one in each session in a console-based application, whereby the
domain expert inputs the true class of the sentence queried by the
active learner [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ].
        </p>
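        <p>A dictionary-based feature of the kind described above can be sketched as follows. This is an illustrative assumption rather than the published implementation: the function name dictionary_features, the binary encoding, and the simple whitespace-delimited matching (which ignores punctuation) are hypothetical choices; the mentions are taken from the seed concepts used later in the paper.</p>
        <preformat>
```python
def dictionary_features(sentence, dictionary):
    """Return one binary feature per concept.

    A feature is 1 when any mention of the concept occurs in the sentence
    as a whole word or phrase, and 0 otherwise.
    """
    # Normalise whitespace and pad with spaces so that matching is word-aligned.
    text = " " + " ".join(sentence.lower().split()) + " "
    return {
        concept: int(any(" " + mention + " " in text for mention in mentions))
        for concept, mentions in dictionary.items()
    }


dictionary = {
    "DIRECTIVE": ["directive", "directives"],
    "ENTERPRISE": ["firm", "firms", "enterprise"],
    "REGULATOR": ["competent authority"],
}
features = dictionary_features(
    "This Directive shall apply to investment firms", dictionary
)
```
        </preformat>
        <p>Feature vectors of this form could be passed to any standard classifier; the active learner would then query the domain expert only for the sentences it selects in each session.</p>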
        <p>6 See the definitions section in the European MiFID-2 regulations, Article 4 Definitions,
at http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32014L0065&amp;from=EN
and the definitions section of the Indian Know Your Customer regulations, Section
2 Definitions, at https://rbi.org.in/scripts/BS_ViewMasCirculardetails.aspx?id=9848</p>
        <p>
          The interested reader is invited to refer to [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] and [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] for
experimental results in both the generation of the domain model and
dictionary as well as the rule classifier.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>From Classified Rules to Structured Regulations</title>
        <p>In the following section, we describe how this approach of building a domain
model and a dictionary, and then constituting a rule classifier on
top of these, helps in hierarchical structuring of the regulations. We
describe the structure of a regulation and the regulatory compliance
problem context, then briefly outline the SBVR standard by the Object
Management Group (OMG), which we use for creating a semantic
model of regulation rules. The rules in SBVR are translated to a logic
form that can be used for querying and checking compliance. SBVR
allows rules to be defined in its variant of controlled natural language
called Structured English (SE). We first create a generic rule model
for regulations that serves as a template for the domain expert to
construct the rule base for a specific regulation using extracted
information, as depicted in Figure 1. The detailed process is described in
the next few sections.
</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2 The Regulatory Compliance Context</title>
      <p>Regulatory bodies introduce legislation to mitigate risks faced by
individuals or enterprises. Introduction of new legislation often
involves issuing a directive that gives abstract guidelines, followed by
a regulation that makes concrete recommendations for the guidelines.
The regulatory body usually also makes available other supporting
documents, such as regulatory technical specifications (RTS) and
consultation papers, giving guidelines for implementation through
data and reporting formats, explanatory use case scenarios, etc.<sup>7</sup></p>
      <p>Both the directive and the regulation define goals that aim to
mitigate risks. The regulation typically applies in conjunction with the
parent directive, if one exists. Regulations always have a well-defined
scope within which they are applicable. They include detailed scope
rules defining this scope, such as the entities to which the regulation
applies, the conditions under which it applies, and exemption
conditions. They lay down obligations for entities that fall within the
scope. Obligations are individual regulatory rules that apply to
enterprises. Obligations are usually grouped into sections based on the
domain functions that they govern. In the prevalent manual practice
of regulatory compliance, enterprises that need to comply with the
regulation, legal and compliance experts, auditors, and even
regulators spend huge effort in understanding and interpreting the contents
of regulations in the context of enterprise compliance.</p>
      <p>If a knowledge base that encoded all the obligations of a
regulation were available, the various stakeholders would be able to
query it for the purpose of implementing compliance or to
ascertain enterprises’ compliance with the regulation. Queries could
include: ‘What are the goals of the regulation?’, ‘What are the risks
it aims to mitigate?’, ‘What is the scope of the regulation?’, ‘What
sort of entities does it apply to?’, ‘What are the broad groups of
obligations it describes?’, ‘What are the obligations impacting
enterprises of type X?’, ‘Given a certain set of data from the enterprise,
is it compliant?’. A knowledge base would make it possible to query
compliance with goals or sub-goals at various levels, with the
regulation as a whole, with groups of obligations, and with individual
obligations, at various stages in the compliance process.
</p>
      <p>7 MiFID documents</p>
      <p>[Figure 2. Diagram labels: Directive aims for Goal; Directive is supported by Regulation; Regulation addresses Risk; Regulation has scope Scope; Regulation lays down Obligation; Scope applies to Enterprise; Goal mitigates Risk; Risk faced by Enterprise; Enterprise fulfils Obligation.]</p>
      <sec id="sec-4-5">
        <p>The key elements (goals, risks, directives, regulations, scope, and
obligations) and their relationships define the semantic structure of a
regulation, depicted in Figure 2. We define a generic set of
compliance rules based on this structure, and term it our generic semantic
(rule) model for regulations. We use this generic model both for
extraction of rules from the regulation text and as a template for
creating a regulation rule base. The next section describes SBVR
and SBVR SE, used to capture our semantic rule model.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>2.3 Semantic modeling of rules using SBVR</title>
      <p>SBVR is an OMG standard that helps define a semantic model of
rules, where rules are defined as compositions of fact types. Fact
types are relations between concepts. This is called a semantic model
because the meaning of the rule is explicated through its component
facts and concepts.</p>
      <p>Since SBVR is intended to capture the vocabulary and rules of
a business domain, OMG provides a controlled natural language
notation for specifying the model, called SBVR Structured English
(SE)<sup>8</sup>.</p>
      <p>Rules in SE are written by imposing modalities such as obligation
and necessity onto compositions of fact types, for example: It is obligatory
that account has balance if customer holds account. Here, customer,
account, and balance are concepts, and ‘customer holds account’ and
‘account has balance’ are fact types. Being a restricted
subset of natural language, SBVR SE can be understood and used with ease by
domain experts. We use SBVR SE to define some generic rules for
regulations, detailed in the next subsection.</p>
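      <p>The idea of a rule as a modality imposed on a composition of fact types can be sketched in code. The classes FactType and Rule below are hypothetical illustrations, not part of the SBVR metamodel or of any SBVR tool; they only reproduce the account example above.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactType:
    """A relation between two concepts, e.g. 'customer holds account'."""
    subject: str
    verb: str
    obj: str

    def text(self):
        return f"{self.subject} {self.verb} {self.obj}"


@dataclass(frozen=True)
class Rule:
    """A modality imposed on one fact type, conditional on another."""
    modality: str          # e.g. "obligatory" or "necessary"
    consequent: FactType
    condition: FactType

    def to_se(self):
        # Render the rule in a Structured-English-like sentence.
        return (
            f"It is {self.modality} that {self.consequent.text()} "
            f"if {self.condition.text()}"
        )


rule = Rule(
    modality="obligatory",
    consequent=FactType("account", "has", "balance"),
    condition=FactType("customer", "holds", "account"),
)
se_text = rule.to_se()
```
      </preformat>
      <p>Keeping concepts and fact types as explicit components is what makes the meaning of a rule inspectable, which is the sense in which the model is called semantic.</p>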
    </sec>
    <sec id="sec-6">
      <title>2.4 Generic Semantic Model for Regulations</title>
      <p>The key elements of a regulation and their relations depicted in
Figure 2 are concepts and fact types in SBVR terminology, in other
words, the conceptual model of a regulation. We define generic rules
for checking compliance based on these concepts and fact types,
depicted in Listing 1.</p>
      <sec id="sec-6-1">
        <title>Listing 1: Generic rules for compliance</title>
        <p>8 Semantics of Business Vocabulary and Business Rules: Annex A: SBVR Structured English, http://www.omg.org/spec/SBVR/1.2/</p>
        <p>[Figure label: Compliance meta-model]</p>
        <p>Rules gen001 and gen002 define compliance of an enterprise with the
directive and the regulation, respectively. Rule gen003 evaluates whether
the enterprise falls within the scope of the regulation. Rule gen004
defines the relation between goals and risks. These rules define a
generic template that can be instantiated to create a rule template
for a specific regulation, by substituting generic concepts with their
instances from the regulation text. The specific rule template is
then filled in with rules from the regulation text to create its rule
base. Instances of generic concepts, and rule suggestions are found
through automated extraction from the regulation text, using the
techniques detailed in earlier sections.</p>
        <p>We illustrate our approach of rule base creation using a subset of
the MiFID-2 regulation. The next section gives a brief description
of the regulation.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>2.5 MiFID-2 example</title>
      <p>MiFID-2 is a directive introduced in the European Union to regulate
the functioning of financial markets and bring greater transparency
to their operation, in order to safeguard the interests of customers of
investment firms. It mainly lays down obligations on investment
firms to report trading transactions carried out on secondary markets
to the appropriate authorities, to enable oversight. The directive is
supported by the MiFIR regulation, an RTS, and several consultation
papers<sup>9</sup>.</p>
      <p>Broadly, the level of detail increases from directive to regulation to
RTS. We used a subset of the text from each regulatory document for
our case study. Elements of regulatory information, their document
sources, and the criteria applied in picking the subset of document
text for the case study are listed below.</p>
      <p>∙ Risks, goals, scope, definitions, and high-level obligations
from the directive. The Introduction and Article 1 (Scope and
Definitions section) of the directive were used as source text.
∙ Scope and definitions from the regulation. Article 1 (Scope and
Definitions) was used as source text.
∙ Obligations from the regulation. Here, we scoped the text by
selecting one high-level obligation from the directive and
picking the corresponding sections from the directive,
regulation, and RTS, viz. Article 26.
∙ Detailed specification of obligations from the RTS (sections
with references to Article 26).
∙ Data definitions from the regulation appendix.</p>
      <p>Sample text from the directive document is reproduced here to
illustrate rule and non-rule text. The first paragraph is non-rule text,
while the second paragraph gives a scope rule.</p>
      <p>The financial crisis has exposed weaknesses in the functioning
and in the transparency of financial markets. The evolution of
financial markets has exposed the need to strengthen the framework for
the regulation of markets in financial instruments, including where
trading in such markets takes place over-the-counter (OTC), in order
to increase transparency, better protect investors, reinforce
confidence, address unregulated areas, and ensure that supervisors are
granted adequate powers to fulfil their tasks......</p>
      <p>Article 1: Scope
1. This Directive shall apply to investment firms, market operators,
data reporting services providers, and third-country firms providing
investment services or performing investment activities through the
establishment of a branch in the Union....</p>
      <p>The next section describes automated extraction of these elements
from the document sources.
</p>
    </sec>
    <sec id="sec-8">
      <title>2.6 Rule Extraction from Regulatory Documents</title>
      <p>In the first iteration, the generic concepts of Figure 2 and their
mentions are given as seed concepts for extraction. These are shown
in Listing 2. Concepts and mentions are given as mention:CONCEPT,
and relations as CONCEPT&gt;relation&gt;CONCEPT.</p>
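      <p>The seed input notation can be read mechanically. The sketch below is a hypothetical parser, assuming one entry per list element (the extractor's actual input tokenization is not specified in the text, and multi-word mentions make whitespace splitting ambiguous); the function name parse_seed is an assumption made for illustration.</p>
      <preformat>
```python
def parse_seed(entries):
    """Split seed entries into a mention-to-concept map and relation triples.

    Entries of the form 'mention:CONCEPT' populate the map; entries of the
    form 'CONCEPT>relation>CONCEPT' become (source, relation, target) triples.
    """
    mentions, relations = {}, []
    for entry in entries:
        if ">" in entry:
            source, relation, target = (part.strip() for part in entry.split(">"))
            relations.append((source, relation, target))
        else:
            # Split on the last colon so the concept name is always the suffix.
            mention, concept = (part.strip() for part in entry.rsplit(":", 1))
            mentions[mention] = concept
    return mentions, relations


seed = [
    "directive:DIRECTIVE",
    "regulatory technical standards:RTS",
    "competent authority:REGULATOR",
    "DIRECTIVE>aims for>GOAL",
    "GOAL>mitigates>RISK",
]
mentions, relations = parse_seed(seed)
```
      </preformat>
      <p>The resulting map and triples correspond to the dictionary and the domain model relations that seed the first extraction iteration.</p>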
      <p>Listing 2: Generic concepts and mentions input to rule extractor</p>
      <preformat>directive:DIRECTIVE directives:DIRECTIVE risks:RISK risk:RISK regulation:REGULATION regulations:REGULATION
aim:GOAL goal:GOAL aims:GOAL goals:GOAL need:OBLIGATION necessary:OBLIGATION necessity:OBLIGATION requirements:OBLIGATION requirement:OBLIGATION policy:POLICY policies:POLICY regulatory technical standards:RTS
scope:SCOPE obligations:OBLIGATION obligation:OBLIGATION definitions:DEFINITION definition:DEFINITION rule:RULE controls:CONTROL
competent authority:REGULATOR legal framework:REGULATORY FRAMEWORK regulatory framework:REGULATORY FRAMEWORK
enterprise:ENTERPRISE enterprises:ENTERPRISE entity:ENTERPRISE entities:ENTERPRISE organization:ENTERPRISE organizations:ENTERPRISE institution:ENTERPRISE institutions:ENTERPRISE firm:ENTERPRISE firms:ENTERPRISE
DIRECTIVE&gt;aims for&gt;GOAL DIRECTIVE&gt;is supported by&gt;REGULATION GOAL&gt;mitigates&gt;RISK REGULATION&gt;has scope&gt;SCOPE REGULATION&gt;lays down&gt;OBLIGATION ENTERPRISE&gt;fulfils&gt;OBLIGATION SCOPE&gt;applies to&gt;ENTERPRISE RISK&gt;faced by&gt;ENTERPRISE REGULATION&gt;addresses&gt;RISK</preformat>
      <p>9 MiFID-2: http://ec.europa.eu/finance/securities/isd/mifid2/index_en.htm</p>
      <p>
The rule extractor is run on all the available documents, viz. directive,
regulation, and RTS. This brings up rule suggestions from the
regulatory texts that contain mentions of these key concepts. Instances of
concepts can be found in these, e.g. MiFID as instance of directive,
transparency in financial markets as instance of goal. Examples of
rule suggestions that come up in the first iteration are scope rules and
high-level obligations, due to the seed concepts scope, enterprise,
requirements and their mentions given as input. These are illustrated
in Listing 3.</p>
      <p>Listing 3: Extracted rule suggestions from MiFID-2 documents</p>
      <preformat>// From Directive
rule a5697 It is obligatory that This Directive applies to investment firms market operators data reporting services providers third country firms providing investment services or performing investment activities establishment of branch in Union
rule a1292 This Directive establishes requirements to authorisation operating conditions investment firms provision of investment services or activities third country firms.
// From Regulation
rule a7790 This Regulation establishes uniform requirements to disclosure of trade data to public reporting of transactions to competent authorities trading of derivatives organised venues discriminatory access to clearing discriminatory access to trading in benchmarks product intervention powers of competent authorities ESMA EBA powers of ESMA position management controls position limits provision of investment services or activities third country firms or branch</preformat>
      <p>The suggestions are in a format very close to Structured English.
The domain expert can use them to write SE rules with very little
editing, as illustrated in the next section. In subsequent iterations,
specific concepts from obligations that need to be explicated further are
given to the rule extraction engine, to extract detailed obligations.
The next section describes the steps to create the rule base using
extracted information.
3</p>
    </sec>
    <sec id="sec-9">
      <title>RULE BASE CREATION</title>
      <p>Rule base creation using the template and extracted rule suggestions
is described here as a set of steps, illustrated in Figure 3.</p>
      <sec id="sec-9-1">
        <title>Step 1: Identify instances of generic concepts</title>
        <p>Instances of concepts found in the rule suggestions are listed by the experts as
instances in the rule base, using is_a facts, as shown in Listing 4.</p>
        <p>[Figure 3: Steps in rule base creation. Labels recovered from the figure: Iteration 1; Domain expert; Step 1: Identify instances of generic concepts; Step 2: Create specific rule template; Generic regulation rule template; Specific rule template in SE; Step 3: Define scope rules and obligations using template; Regulation rule model; Automated translation; Step 5: Generate logic specification of rules; Regulation rule base.]</p>
        <sec id="sec-9-1-1">
          <title>Iteration 1</title>
          <p>Listing 4: Instances of generic regulation concepts extracted from MiFID-2 documents
rule um001 MiFID is_a directive
rule um002 MiFIR is_a regulation
rule um003 MiFID is_supported_by MiFIR
rule um004 transparency_in_financial_markets is_a goal
rule um005 weakness_in_functioning_of_financial_markets is_a risk
rule um006 regulation_of_financial_markets is_a goal
rule um007 MiFIR_scope is_a regulation_scope
rule um008 MiFIR aims_for transparency_in_financial_markets
rule um009 MiFIR has_scope MiFIR_scope
rule um010 transparency_in_financial_markets mitigates weakness_in_functioning_of_financial_markets
rule um011 regulation_of_financial_markets mitigates weakness_in_functioning_of_financial_markets
rule um012 MiFIR_obligations is_a regulation_obligations
rule um013 MiFIR aims_for regulation_of_financial_markets
rule um014 MiFIR lays_down MiFIR_obligations
rule um015 MiFIR has_scope MiFIR_scope</p>
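          <p>As a rough illustration of how the is_a facts of Listing 4 tie instances to generic concepts, the sketch below stores a subset of the facts as (subject, relation, object) triples and groups instance names by concept. The triple representation and the names FACTS and instances_by_concept are assumptions made for this example; they are not part of the authors' tooling.</p>
          <preformat>
```python
from collections import defaultdict

# A subset of the facts in Listing 4, transcribed as
# (subject, relation, object) triples. The triple form is an assumption.
FACTS = [
    ("MiFID", "is_a", "directive"),
    ("MiFIR", "is_a", "regulation"),
    ("MiFID", "is_supported_by", "MiFIR"),
    ("transparency_in_financial_markets", "is_a", "goal"),
    ("weakness_in_functioning_of_financial_markets", "is_a", "risk"),
    ("regulation_of_financial_markets", "is_a", "goal"),
    ("MiFIR_scope", "is_a", "regulation_scope"),
    ("MiFIR_obligations", "is_a", "regulation_obligations"),
]


def instances_by_concept(facts):
    """Group instance names under the generic concept they instantiate."""
    index = defaultdict(list)
    for subject, relation, obj in facts:
        if relation == "is_a":
            index[obj].append(subject)
    return dict(index)


INSTANCES = instances_by_concept(FACTS)
print(INSTANCES["goal"])
```
          </preformat>
          <p>Facts with relations other than is_a, such as is_supported_by, are ignored by the grouping and remain available as ordinary relations between instances.</p>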
        </sec>
      </sec>
      <sec id="sec-9-2">
        <title>Step 2: Create specific rule template</title>
        <p>The generic rule template of Listing 1 is instantiated by replacing
concept names with instance names in the rules to generate the
specific rule template for the regulation, shown in Listing 5.</p>
        <p>Listing 5: Specific rule template for the MiFID-2 regulation
// Instantiated rules
rule gen001 It is obligatory that enterprise complies_with MiFID if MiFID is_supported_by MiFIR &amp;&amp; enterprise complies_with MiFIR
rule gen002 It is obligatory that enterprise complies_with MiFIR if enterprise falls_within_scope_of MiFIR &amp;&amp; MiFIR lays_down MiFIR_obligations &amp;&amp; enterprise fulfils MiFIR_obligations
rule gen003 It is obligatory that enterprise falls_within_scope_of MiFIR if MiFIR has_scope MiFIR_scope &amp;&amp; MiFIR_scope applies_to enterprise
rule gen004 It is necessary that MiFIR addresses risk if MiFIR aims_for goal &amp;&amp; goal mitigates risk
rule gen005 It is obligatory that enterprise falls_within_scope_of MiFID if MiFID is_supported_by MiFIR &amp;&amp; enterprise falls_within_scope_of MiFIR</p>
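        <p>The instantiation step can be sketched as a term-by-term substitution of instance names for concept names. The generic rule below is an assumption: it is reconstructed by reversing the substitution visible in rule gen002 and is not copied from Listing 1. The names GENERIC_HEAD, GENERIC_BODY, BINDINGS and the helper functions are hypothetical, and the rendering writes "and" where the listings use &amp;&amp;.</p>
        <preformat>
```python
# Hypothetical generic rule, reconstructed by reversing gen002 of Listing 5.
GENERIC_HEAD = ("enterprise", "complies_with", "regulation")
GENERIC_BODY = [
    ("enterprise", "falls_within_scope_of", "regulation"),
    ("regulation", "lays_down", "regulation_obligations"),
    ("enterprise", "fulfils", "regulation_obligations"),
]

# Concept name -> instance name, as identified in Step 1.
BINDINGS = {
    "regulation": "MiFIR",
    "regulation_obligations": "MiFIR_obligations",
}


def instantiate(predicate, bindings):
    """Replace whole terms only, so 'regulation' never rewrites
    part of 'regulation_obligations'."""
    return tuple(bindings.get(term, term) for term in predicate)


def instantiate_rule(head, body, bindings):
    """Instantiate the head and every body predicate of one rule."""
    return instantiate(head, bindings), [instantiate(p, bindings) for p in body]


def to_structured_english(head, body):
    """Render a rule in an SE-like form; 'and' stands in for the
    conjunction symbol used in the listings."""
    conditions = " and ".join(" ".join(p) for p in body)
    return "It is obligatory that " + " ".join(head) + " if " + conditions


HEAD, BODY = instantiate_rule(GENERIC_HEAD, GENERIC_BODY, BINDINGS)
print(to_structured_english(HEAD, BODY))
```
        </preformat>
        <p>Substituting on whole terms rather than on raw text is a deliberate choice here: a plain string replacement of regulation would also corrupt regulation_obligations.</p>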
      </sec>
      <sec id="sec-9-3">
        <title>Step 3: Define scope rules and obligations using the template</title>
        <p>The specific template contains placeholders for scope rules and
obligations: the predicates MiFIR has_scope MiFIR_scope and
MiFIR_scope applies_to enterprise in rule gen003, and the predicates
MiFIR lays_down MiFIR_obligations and enterprise fulfils
MiFIR_obligations in rule gen002. These need to be detailed further in order
to complete the definition of the rule base. Their details are obtained
from the extracted rule suggestions illustrated in Listing 3. Rules
written using the suggestions can be seen in Listing 6, with the same
rule numbers as the suggestions they are derived from.</p>
        <p>Listing 6: Rule base with scope rules and obligations
rule a5697 It is obligatory that MiFIR_scope is_for enterprise if enterprise is_a investment_firm || enterprise is_a regulated_markets || enterprise is_a reporting_firm || enterprise is_a third_country_investment_firms_operating_in_EU &amp;&amp; enterprise has_established branch_in_EU
rule a7790 It is obligatory that enterprise fulfils MiFIR_requirements if enterprise fulfils requirements_for_disclosure_of_trade_data_to_public &amp;&amp; enterprise fulfils requirements_for_reporting_of_transactions &amp;&amp; enterprise fulfils requirements_for_trading_of_derivatives_on_organised_venues &amp;&amp; enterprise fulfils requirements_for_non_discriminatory_access_to_clearing &amp;&amp; enterprise fulfils requirements_for_non_discriminatory_access_to_trading_benchmarks &amp;&amp; enterprise fulfils requirements_for_product_intervention_powers &amp;&amp; enterprise fulfils requirements_for_activities_by_third_party_firms</p>
        <p>Rule a5697 defines MiFIR_scope is_for enterprise in terms of the
scope rule suggestion a5697 extracted from the directive text, which details
the kinds of enterprises the regulation applies to. Rule a7790 defines
enterprise fulfils MiFIR_requirements as a set of high-level
obligations given in the regulation, again obtained from the extracted rule
suggestion.</p>
        <p>The obligation rules are detailed further, using suggestions
extracted from regulation or RTS documents. For example, rules a8033, a9259,
a8133, and a8233 define enterprise fulfils
requirements_for_reporting_of_transactions using a hierarchy of rules
expressing obligations, shown in Listing 7. Concepts from the obligations are
then given as input to the rule extractor to extract the next set of rule
suggestions, which can be used to detail obligations still further. This
process is repeated iteratively.</p>
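        <p>To make the scope rule concrete, the following sketch evaluates the conditions of rule a5697 against a handful of enterprise facts. Two things are assumptions: the rule is read with the conjunction binding more tightly than the disjunction, so the branch condition applies only to third-country firms, and the enterprise facts are invented purely for illustration. The function name in_mifir_scope is hypothetical.</p>
        <preformat>
```python
# Enterprise kinds named in rule a5697 of Listing 6.
DIRECT_KINDS = {"investment_firm", "regulated_markets", "reporting_firm"}
THIRD_COUNTRY_KIND = "third_country_investment_firms_operating_in_EU"


def in_mifir_scope(enterprise, facts):
    """Return True if the enterprise meets the scope conditions of a5697.

    Assumed reading: a third-country firm is in scope only if it has
    also established a branch in the EU.
    """
    kinds = {obj for subj, rel, obj in facts
             if subj == enterprise and rel == "is_a"}
    has_branch = (enterprise, "has_established", "branch_in_EU") in facts
    directly_in_scope = not kinds.isdisjoint(DIRECT_KINDS)
    third_country_in_scope = THIRD_COUNTRY_KIND in kinds and has_branch
    return directly_in_scope or third_country_in_scope


# Hypothetical enterprise data facts, invented for this example.
ENTERPRISE_FACTS = {
    ("acme_brokers", "is_a", "investment_firm"),
    ("overseas_trading", "is_a", THIRD_COUNTRY_KIND),
    ("globex_capital", "is_a", THIRD_COUNTRY_KIND),
    ("globex_capital", "has_established", "branch_in_EU"),
    ("corner_bakery", "is_a", "retailer"),
}

for name in ("acme_brokers", "overseas_trading", "globex_capital", "corner_bakery"):
    print(name, in_mifir_scope(name, ENTERPRISE_FACTS))
```
        </preformat>
        <p>If the intended reading were instead that the branch condition applies to every kind of enterprise, only the final return expression would change; the sketch makes that choice explicit rather than settling it.</p>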
      </sec>
      <sec id="sec-9-4">
        <title>Step 4: Generate logic specification of rules</title>
        <p>
          The SE rules written by the domain expert are automatically translated into an SBVR
model of rules. The description of this work is outside the scope
of this paper. Rules in SBVR are translated to the defeasible logic
formalism DR-Prolog [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], using the translation mechanism described in
[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. The translated rules in DR-Prolog are shown in Listing 8.
        </p>
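        <p>The general shape of such a translation can be sketched as follows. This is only an illustration: it emits plain Prolog-style Horn clause text, drops the modality of the rule, and does not reproduce the SBVR model or the DR-Prolog metaprogram syntax, neither of which is shown in this section. Treating the term enterprise as a logic variable and lower-casing instance names into atoms are assumptions of the example, as are the names VARIABLES, term, literal, and clause.</p>
        <preformat>
```python
# Terms treated as logic variables; everything else becomes an atom.
# Which terms are variables is an assumption of this sketch.
VARIABLES = {"enterprise"}


def term(name):
    """Render a rule term as a Prolog variable or a lower-case atom."""
    if name in VARIABLES:
        return name.capitalize()
    return name.lower()


def literal(predicate):
    """Turn a (subject, relation, object) predicate into relation(S, O)."""
    subject, relation, obj = predicate
    return "{}({}, {})".format(relation, term(subject), term(obj))


def clause(head, body):
    """Render one rule as a Horn clause; the rule's modality is not encoded."""
    return literal(head) + " :- " + ", ".join(literal(p) for p in body) + "."


# Rule gen003 from Listing 5, written as head and body predicates.
GEN003 = clause(
    ("enterprise", "falls_within_scope_of", "MiFIR"),
    [("MiFIR", "has_scope", "MiFIR_scope"),
     ("MiFIR_scope", "applies_to", "enterprise")],
)
print(GEN003)
```
        </preformat>
        <p>A defeasible encoding would additionally carry a rule label and the modality, which this plain-clause sketch deliberately leaves out.</p>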
        <p>
          These rules can be directly checked against enterprise data facts.
It is seen from the listing in DR-Prolog that the lowest-level rules
contain data definitions. Our earlier work [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] deals with the
process of arriving at these rules, as well as the necessary enterprise
data facts, hence it is not explained here. The term obligations has
been used throughout the text to mean rules that have any of the
modalities obligation, permission, prohibition and necessity. Each
of these modalities has been implemented using the defeasible logic
metaprogram of DR-Prolog [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], which provides an implementation
for each modality using the constructs available in standard Prolog.
In the next section, we discuss whether the described approach meets
its objectives.
        </p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>DISCUSSION</title>
      <p>
        One of the key objectives of using the template-based approach is
isomorphism, i.e. imparting a structure to the rule base similar to
that of the original regulation document sources, discussed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
The generic concept model we have used to guide the extraction
refers to the regulatory sources, viz. directive and regulation, and
embodies the structure of a regulation. This structure and
traceability are maintained right from the initial rule specification in
Structured English to the model and the formal rule specification in DR-Prolog.
In the absence of such a process, the onus is on experts to
ensure both structure and correctness, and structuring and achieving
isomorphism become subjective. A manual process of rule
construction goes through several iterations to achieve correctness in
the rule hierarchy. Our generic model-guided extraction seeks to
avoid omission of key elements and reduce errors through automated
generation of the rule hierarchy, predicates, and parameters. This
takes away the complexity associated with manual construction of
formal logic rules. Since extracted rules depend completely on the seed
concepts given, which may vary from user to user, guided extraction
with the generic set of seed concepts gives uniformity and assurance.
An example of omission during our earlier manual rule base creation
experiment is that we had missed encoding scope rules regarding
enterprises to which the MiFID-2 regulation is applicable.
      </p>
      <p>
        Reduced burden on domain experts and faster knowledge
engineering seem to justify the development cost of our rule generation
framework. However, empirical evidence is crucial to support this
claim. We are in the process of conducting a systematic
empirical evaluation of our approach. It must be mentioned here that
the problems mentioned in the pioneering and extensive work on
encoding regulations in formal logic [
        <xref ref-type="bibr" rid="ref23 ref5">5, 23</xref>
        ], such as the need for
simplification when encoding legislation and handling cross-references
within regulations, remain. We have dealt with some of these, such
as simplification of complex sentence constructs, bulleted lists, and
cross-references, in our work on extraction.
      </p>
      <p>The important objective of having a formal rule base is being able
to answer queries about the regulation. Using our resultant rule base
structure, we are able to answer the queries listed in Section 2.2 with
regard to goals addressed by the regulation, the kinds of enterprises
it applies to, as well as compliance of all or specific entities whose
ground data is provided, to all or specific rules.</p>
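      <p>A query about the goals and risks addressed by the regulation can be sketched by following rule gen004 of Listing 5 (MiFIR addresses risk if MiFIR aims_for goal and goal mitigates risk) over the facts of Listing 4. The triple representation and the function names objects and risks_addressed are assumptions of this example, not the query interface of the rule base.</p>
      <preformat>
```python
# Facts um008, um010, um011, and um013 from Listing 4 as triples.
FACTS = {
    ("MiFIR", "aims_for", "transparency_in_financial_markets"),
    ("MiFIR", "aims_for", "regulation_of_financial_markets"),
    ("transparency_in_financial_markets", "mitigates",
     "weakness_in_functioning_of_financial_markets"),
    ("regulation_of_financial_markets", "mitigates",
     "weakness_in_functioning_of_financial_markets"),
}


def objects(facts, subject, relation):
    """All objects related to the subject by the relation, sorted."""
    return sorted(o for s, r, o in facts if s == subject and r == relation)


def risks_addressed(facts, regulation):
    """Apply gen004: a regulation addresses a risk if it aims for a goal
    and that goal mitigates the risk."""
    risks = set()
    for goal in objects(facts, regulation, "aims_for"):
        risks.update(objects(facts, goal, "mitigates"))
    return sorted(risks)


print(objects(FACTS, "MiFIR", "aims_for"))
print(risks_addressed(FACTS, "MiFIR"))
```
      </preformat>
      <p>Both goals lead to the same risk, so the query reports it once; the intermediate goals remain available through the first call when an explanation of the answer is needed.</p>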
      <p>Advantages of our approach include automated extraction of necessary
supporting rules such as investment_firm executes transaction and
transaction trades financial_instrument, which were not retrieved even
in non-guided extraction. We are currently testing this hypothesis
on larger examples. We have not so far encountered any problems
in using automated rule generation. A difference from the manual
encoding approach is that users need to familiarize themselves
with generated rule names and some indirections in the generated
rules when tracing rule execution during compliance checking. Rule
expressiveness in our approach is adequate and scales well, since
SBVR has a very rich meta-model with a direct correspondence to
SBVR Structured English. Being fact-oriented, SBVR maps directly
to DR-Prolog, which scales for large datasets as well.</p>
      <p>The generic model described here is a little basic, but can and
should be altered as required, if the regulation being worked on
has a different structure or important sections or elements that need
to be incorporated into the structure of the rule base. We plan to
also enhance the model using the learning from our experience
of applying the approach to several regulations. The next section
reviews related work.
</p>
    </sec>
    <sec id="sec-11">
      <title>RELATED WORK</title>
      <p>
        We survey work with objectives similar to ours, namely structured
representation of regulatory content to allow formal checking and analysis.
Encoding the British Nationality Act as a logic program in Prolog
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] was pioneering work in encoding regulations in formal logic,
as is [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. A need for intermediate representation of natural language
regulations was underlined in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. These and later formal approaches
to encoding regulations [
        <xref ref-type="bibr" rid="ref11 ref3">3, 11</xref>
        ] require regulation rules to be encoded
manually in the logic formalism.
      </p>
      <p>Listing 7: Rule base with detailed obligations</p>
      <p>
        The important requirement of integrating compliance checking
and accessing related regulatory documents is addressed in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
They build a compliance assistance framework that uses first-order
logic (FOL) representations of regulation rules, as well as related
questions and answers. The FOL rules are linked to their occurrences
in regulation documents through an XML representation of tagged
regulation information. Writing of the FOL rules and tagging to
regulation text is done manually. Our endeavour is semi-automated
creation of a rule base of the entire regulation, and logical structuring
of the rule base to be able to reason about compliance with the
higher-level goals of the regulation.
      </p>
      <p>
        Most of the current state of the art in legal rule extraction contains
an implicit step of rule identification. This step often encompasses
several other constituent steps, like identifying segmentation of
regulations, ascertaining modality of the regulations such as whether
the rule is an obligation or a permission, and so on [
        <xref ref-type="bibr" rid="ref28 ref29 ref30 ref7">7, 28–30</xref>
        ]. The
final constituent step is often writing the chosen logical specification
of the NL rule. We believe that by separating these concerns from rule
identification, it is possible to defer their treatment until we obtain
logical specifications, albeit partial ones. We believe that the logical
specification language itself, such as for instance DR-Prolog [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]
can be used to take care of the segmentation and cross referencing,
because at that level of abstraction, we already have access to schema
of the information required for regulations.
      </p>
      <p>
        In contrast, the current state of the art often focuses on the
treatment of legal syntactical specifics early on. This is evident in the
governance extraction model which manually classifies and attempts
to extract regulations as legal requirements in terms of procedural,
declarative, ontology statements [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Further fine level classification
includes access-rights statements and delegation of authority rights.
Another work proposes to group legal sentences into few categories
referred to as juridical natural language constructs (JNLCs). JNLCs
are proposed to be parsed using unification grammars [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. In a
similar vein, legal concepts are proposed to be classified into rights,
obligations, privileges, no-rights, powers, liabilities, immunities, and
disabilities using a production rule model in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Another work
proposes to use a categorization of provisions and an ML classifier
trained to identify the provisions in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        GaiusT, a tool based on the Cerno framework, and related research
present semi-automated rule extraction with precision and recall
numbers similar to ours [
        <xref ref-type="bibr" rid="ref19 ref30 ref31 ref7 ref8">7, 8, 19, 30, 31</xref>
        ]. In contrast to our approach,
this tool and the framework use a number of intermediate artifacts,
namely form simplification through semantic parameterization [
        <xref ref-type="bibr" rid="ref7 ref8">7,
8</xref>
        ], structural comprehension [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], as well as semantic annotation
[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]. Their approach seems to be restricted in applicability as well,
since all of the above activities have to be performed for any new
regulation to which they apply the framework. It is likely that when
the regulation is large, like MiFID-2 or FATCA (footnote 10), the interaction
between various components becomes hard to handle. At the same
time, this approach presents some ideas around a conceptual (meta-)
model of deontic concepts and a rules generator which could be of
use to us.
      </p>
      <p>
        An approach presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] contrasts ML-based text
classification with knowledge engineering-based (KE-based) text
classification. The idea behind KE is that definitions of legal terms are
formulated using specific phrases; presuming that only a few
clear and easily observable patterns are used for each type of legal
sentence or provision, these so-called classification patterns
can be used for classification. It was found that such an approach is
susceptible to the same complexities of legal sentences which also
affect ML-based classification negatively. Some of these complexities
are classification keywords appearing in auxiliary sentences rather
than the main sentence to be classified, missing standard phrases,
syntactical and lexical variation in the standard phrases and so on.
In our own experiments, we too included certain phrases as
indicating definitional regulations (those regulations which define domain
entities and their specializations) as well as rules in the approximate
dictionary chunker. Like the results in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], our own experiments
indicated that there are no perceptible differences in the performance of
the classifier when these additional phrases are considered as well. In
contrast to [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], if we liken our approach to KE-based classification,
then we go one step further and actually make use of the domain
model entities and the dictionary in the classification.
      </p>
      <p>10 FATCA: Foreign Account Tax Compliance Act, https://www.irs.gov/businesses/corporations/foreign-account-tax-compliance-act-fatca</p>
      <p>Listing 8: A section of the translated rules in DR-Prolog</p>
    </sec>
    <sec id="sec-12">
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>We used a generic conceptual model of regulations to extract specific
concepts, relations, and rules from the regulation text. This gave us
a better directed approach to rule extraction and more structured
rule suggestions. We used a generic model of regulation rules based
on the conceptual model as a template for the regulation rule base
and found it gave us a method, and a structured rule base with less
rework, as well as some assurances on inclusion of vital sections
of information about the regulation. It created a rule hierarchy that
helps reason all the way from ground data to high-level goals of
the regulation. We believe this principled approach gives us a more
accurate and functional model of the regulation.</p>
      <p>We have experimented with using this generic concept-driven
extraction on sections of the KYC regulation. We plan to further test
the method on the entire KYC regulation and two more regulations,
and enhance the generic model and template as required. We also
plan to conduct an empirical study comparing this approach of rule
base construction to the manual one, as well as to the conventional
industry approach to compliance.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Grigoris</given-names>
            <surname>Antoniou</surname>
          </string-name>
          , Nikos Dimaresis, and
          <string-name>
            <given-names>Guido</given-names>
            <surname>Governatori</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>A System for Modal and Deontic Defeasible Reasoning</article-title>
          .
          <source>In AI 2007: Advances in Artificial Intelligence, 20th Australian Joint Conference on Artificial Intelligence</source>
          , Gold Coast, Australia, December 2-6, 2007, Proceedings.
          <fpage>609</fpage>
          -
          <lpage>613</lpage>
          . https://doi.org/10.1007/978-3-540-76928-6_62
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Grigoris</given-names>
            <surname>Antoniou</surname>
          </string-name>
          , Nikos Dimaresis, and
          <string-name>
            <given-names>Guido</given-names>
            <surname>Governatori</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>A modal and deontic defeasible reasoning system for modelling policies and multi-agent systems</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>36</volume>
          ,
          <issue>2</issue>
          (
          <year>2009</year>
          ),
          <fpage>4125</fpage>
          -
          <lpage>4134</lpage>
          . https://doi.org/10.1016/j.eswa.2008.03.009
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Ahmed</given-names>
            <surname>Awad</surname>
          </string-name>
          , Sergey Smirnov, and
          <string-name>
            <given-names>Mathias</given-names>
            <surname>Weske</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Resolution of Compliance Violation in Business Process Models: A Planning-Based Approach</article-title>
          .
          <source>OTM Conferences (1)</source>
          <year>2009</year>
          :
          <fpage>6</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Trevor J. M.</given-names>
            <surname>Bench-Capon</surname>
          </string-name>
          and
          <string-name>
            <given-names>Frans</given-names>
            <surname>Coenen</surname>
          </string-name>
          .
          <year>1992</year>
          .
          <article-title>Isomorphism and legal knowledge based systems</article-title>
          .
          <source>Artif. Intell. Law 1</source>
          ,
          <issue>1</issue>
          (
          <year>1992</year>
          ),
          <fpage>65</fpage>
          -
          <lpage>86</lpage>
          . https://doi.org/10.1007/BF00118479
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Trevor J. M.</given-names>
            <surname>Bench-Capon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. O.</given-names>
            <surname>Robinson</surname>
          </string-name>
          , Tom Routen, and
          <string-name>
            <given-names>Marek J.</given-names>
            <surname>Sergot</surname>
          </string-name>
          .
          <year>1987</year>
          .
          <article-title>Logic Programming for Large Scale Applications in Law: A Formalisation of Supplementary Benefit Legislation</article-title>
          .
          <source>In Proceedings of the First International Conference on Artificial Intelligence and Law</source>
          , ICAIL '87, Boston, MA, USA, May 27-29, 1987
          .
          <fpage>190</fpage>
          -
          <lpage>198</lpage>
          . https://doi.org/10.1145/41735.41757
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Carlo</given-names>
            <surname>Biagioli</surname>
          </string-name>
          , Enrico Francesconi, Andrea Passerini, Simonetta Montemagni, and
          <string-name>
            <given-names>Claudia</given-names>
            <surname>Soria</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Automatic Semantics Extraction In Law Documents</article-title>
          .
          <source>In ICAIL</source>
          , June 6-11, 2005, Italy.
          <fpage>133</fpage>
          -
          <lpage>140</lpage>
          . https://doi.org/10.1145/1165485.1165506
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Travis D.</given-names>
            <surname>Breaux</surname>
          </string-name>
          and
          <string-name>
            <given-names>Annie I.</given-names>
            <surname>Antón</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Deriving Semantic Models from Privacy Policies</article-title>
          .
          <source>In 6th POLICY Workshop</source>
          , Sweden.
          <fpage>67</fpage>
          -
          <lpage>76</lpage>
          . https://doi.org/10.1109/POLICY.2005.12
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Travis D.</given-names>
            <surname>Breaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Annie I.</given-names>
            <surname>Antón</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jon</given-names>
            <surname>Doyle</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Semantic parameterization: A process for modeling domain descriptions</article-title>
          .
          <source>ACM Trans. Softw. Eng. Methodol</source>
          .
          <volume>18</volume>
          ,
          <issue>2</issue>
          (
          <year>2008</year>
          ). https://doi.org/10.1145/1416563.1416565
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Emile</given-names>
            <surname>de Maat</surname>
          </string-name>
          , Kai Krabben, and
          <string-name>
            <given-names>Radboud</given-names>
            <surname>Winkels</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Machine Learning Versus Knowledge Based Classification of Legal Texts</article-title>
          .
          <source>In Proceedings JURIX</source>
          <year>2010</year>
          . IOS Press, Amsterdam, The Netherlands,
          <fpage>87</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Mauro</given-names>
            <surname>Dragoni</surname>
          </string-name>
          , Guido Governatori, and
          <string-name>
            <given-names>Serena</given-names>
            <surname>Villata</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <source>Automated Rules Generation from Natural Language Legal Texts. In Workshop on Automated Detection, Extraction and Analysis of Semantic Information in Legal Texts</source>
          . San Diego, USA,
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Guido</given-names>
            <surname>Governatori</surname>
          </string-name>
          and
          <string-name>
            <given-names>Antonino</given-names>
            <surname>Rotolo</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>A conceptually rich model of business process compliance</article-title>
          .
          <source>In APCCM</source>
          <year>2010</year>
          :
          <fpage>3</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Jan</given-names>
            <surname>Hajic</surname>
          </string-name>
          , Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Padó</surname>
          </string-name>
          , Jan Stepánek, Pavel Stranák, Mihai Surdeanu, Nianwen Xue, and
          <string-name>
            <given-names>Yi</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages</article-title>
          .
          <source>In Proceedings of the Thirteenth Conference on Computational Natural Language Learning: Shared Task</source>
          ,
          <source>CoNLL</source>
          <year>2009</year>
          , Boulder, Colorado, USA, June 4,
          <year>2009</year>
          .
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          . http://aclweb.org/anthology/W/W09/W09-1201.pdf
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Terry A.</given-names>
            <surname>Halpin</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Fact-Orientation and Conceptual Logic</article-title>
          .
          <source>In Proceedings EDOC</source>
          <year>2011</year>
          , Finland.
          <fpage>14</fpage>
          -
          <lpage>19</lpage>
          . https://doi.org/10.1109/EDOC.2011.28
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Zellig</given-names>
            <surname>Harris</surname>
          </string-name>
          .
          <year>1954</year>
          .
          <article-title>Distributional structure</article-title>
          .
          <source>Word</source>
          <volume>10</volume>
          ,
          <issue>2-3</issue>
          (
          <year>1954</year>
          ),
          <fpage>146</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Waël</given-names>
            <surname>Hassan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Luigi</given-names>
            <surname>Logrippo</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Governance Requirements Extraction Model for Legal Compliance Validation</article-title>
          .
          <source>In RELAW</source>
          <year>2009</year>
          , USA.
          <fpage>7</fpage>
          -
          <lpage>12</lpage>
          . https://doi.org/10.1109/RELAW.2009.4
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Shawn</given-names>
            <surname>Kerrigan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kincho H.</given-names>
            <surname>Law</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Logic-Based Regulation Compliance Assistance</article-title>
          .
          <source>In Proceedings of the 9th International Conference on Artificial Intelligence and Law</source>
          , ICAIL
          <year>2003</year>
          , Edinburgh, Scotland, UK, June 24-28,
          <year>2003</year>
          .
          <fpage>126</fpage>
          -
          <lpage>135</lpage>
          . https://doi.org/10.1145/1047788.1047820
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Deepali</given-names>
            <surname>Kholkar</surname>
          </string-name>
          , Sagar Sunkle, and
          <string-name>
            <given-names>Vinay</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>From Natural-language Regulations to Enterprise Data using Knowledge Representation and Model Transformations</article-title>
          .
          <source>In Proceedings of the 11th International Joint Conference on Software Technologies (ICSOFT 2016) - Volume 2: ICSOFT-PT</source>
          , Lisbon, Portugal, July 24-26,
          <year>2016</year>
          .
          <fpage>60</fpage>
          -
          <lpage>71</lpage>
          . https://doi.org/10.5220/0006002600600071
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Deepali</given-names>
            <surname>Kholkar</surname>
          </string-name>
          , Sagar Sunkle, and
          <string-name>
            <given-names>Vinay</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Towards Automated Generation of Regulation Rule Bases using MDA</article-title>
          .
          <source>In MODELSWARD 2017 - Accepted.</source>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Nadzeya</given-names>
            <surname>Kiyavitskaya</surname>
          </string-name>
          , Nicola Zeni,
          <string-name>
            <given-names>Travis D.</given-names>
            <surname>Breaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Annie I.</given-names>
            <surname>Antón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>James R.</given-names>
            <surname>Cordy</surname>
          </string-name>
          , Luisa Mich, and
          <string-name>
            <given-names>John</given-names>
            <surname>Mylopoulos</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Automating the Extraction of Rights and Obligations for Regulatory Compliance</article-title>
          .
          <source>In Conceptual Modeling - ER</source>
          .
          <fpage>154</fpage>
          -
          <lpage>168</lpage>
          . https://doi.org/10.1007/978-3-540-87877-3_13
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Jeremy C.</given-names>
            <surname>Maxwell</surname>
          </string-name>
          and
          <string-name>
            <given-names>Annie I.</given-names>
            <surname>Antón</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>The Production Rule Framework: Developing a Canonical Set of Software Requirements for Compliance with Law</article-title>
          .
          <source>In Proceedings of the 1st ACM International Health Informatics Symposium (IHI '10)</source>
          . ACM, New York, NY, USA,
          <fpage>629</fpage>
          -
          <lpage>636</lpage>
          . https://doi.org/10.1145/1882992.1883092
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Marie-Francine</given-names>
            <surname>Moens</surname>
          </string-name>
          , Erik Boiy, Raquel Mochales Palau, and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Reed</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Automatic Detection of Arguments in Legal Texts</article-title>
          .
          <source>In ICAIL '07</source>
          . ACM, New York, NY, USA,
          <fpage>225</fpage>
          -
          <lpage>230</lpage>
          . https://doi.org/10.1145/1276318.1276362
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Sjir</given-names>
            <surname>Nijssen</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>SBVR: Semantics for business</article-title>
          . (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Marek J.</given-names>
            <surname>Sergot</surname>
          </string-name>
          , Fariba Sadri,
          <string-name>
            <given-names>Robert A.</given-names>
            <surname>Kowalski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kriwaczek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Hammond</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H. T.</given-names>
            <surname>Cory</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>The British Nationality Act as a Logic Program</article-title>
          .
          <source>Commun. ACM</source>
          <volume>29</volume>
          ,
          <issue>5</issue>
          (
          <year>1986</year>
          ),
          <fpage>370</fpage>
          -
          <lpage>386</lpage>
          . https://doi.org/10.1145/5689.5920
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Burr</given-names>
            <surname>Settles</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Active Learning Literature Survey</article-title>
          .
          <source>Computer Sciences Technical Report 1648</source>
          . University of Wisconsin-Madison.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Sagar</given-names>
            <surname>Sunkle</surname>
          </string-name>
          , Deepali Kholkar, and
          <string-name>
            <given-names>Vinay</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Model-driven regulatory compliance: A case study of “Know Your Customer” regulations</article-title>
          .
          <source>In 18th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MoDELS</source>
          <year>2015</year>
          , Ottawa, ON, Canada, September 30 - October 2
          ,
          <year>2015</year>
          .
          <fpage>436</fpage>
          -
          <lpage>445</lpage>
          . https://doi.org/10.1109/MODELS.2015.7338275
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Sagar</given-names>
            <surname>Sunkle</surname>
          </string-name>
          , Deepali Kholkar, and
          <string-name>
            <given-names>Vinay</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Comparison and Synergy Between Fact-Orientation and Relation Extraction for Domain Model Generation in Regulatory Compliance</article-title>
          .
          <source>In Conceptual Modeling - 35th International Conference, ER 2016, Gifu, Japan, November 14-17, 2016, Proceedings (Lecture Notes in Computer Science)</source>
          , Isabelle Comyn-Wattiau, Katsumi Tanaka, Il-Yeol Song, Shuichiro Yamamoto, and Motoshi Saeki (Eds.)
          , Vol.
          <volume>9974</volume>
          .
          <fpage>381</fpage>
          -
          <lpage>395</lpage>
          . https://doi.org/10.1007/978-3-319-46397-1_29
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Sagar</given-names>
            <surname>Sunkle</surname>
          </string-name>
          , Deepali Kholkar, and
          <string-name>
            <given-names>Vinay</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Informed Active Learning to Aid Domain Experts in Modeling Compliance</article-title>
          .
          <source>In 20th IEEE International Enterprise Distributed Object Computing Conference, EDOC 2016</source>
          , Vienna, Austria, September 5-9,
          <year>2016</year>
          ,
          <string-name>
            <given-names>Florian</given-names>
            <surname>Matthes</surname>
          </string-name>
          , Jan Mendling, and
          <string-name>
            <given-names>Stefanie</given-names>
            <surname>Rinderle-Ma</surname>
          </string-name>
          (Eds.).
          <source>IEEE Computer Society</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . https://doi.org/10.1109/EDOC.2016.7579382
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Tom M.</given-names>
            <surname>van Engers</surname>
          </string-name>
          , Ron van Gog, and Kamal Sayah.
          <year>2004</year>
          .
          <article-title>A Case Study on Automated Norm Extraction</article-title>
          .
          <source>In Legal Knowledge and Information Systems. Jurix 2004: The Seventeenth Annual Conference (Frontiers in Artificial Intelligence and Applications)</source>
          , T. Gordon (Ed.). IOS Press, Amsterdam,
          <fpage>49</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Adam Z.</given-names>
            <surname>Wyner</surname>
          </string-name>
          and
          <string-name>
            <given-names>Wim</given-names>
            <surname>Peters</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>On Rule Extraction from Regulations</article-title>
          .
          <source>In Legal Knowledge and Information Systems - JURIX 2011: The Twenty-Fourth Annual Conference</source>
          , University of Vienna, Austria, 14th-16th December
          <year>2011</year>
          .
          <fpage>113</fpage>
          -
          <lpage>122</lpage>
          . https://doi.org/10.3233/978-1-60750-981-3-113
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Zeni</surname>
          </string-name>
          , Nadzeya Kiyavitskaya, Luisa Mich,
          <string-name>
            <given-names>James R.</given-names>
            <surname>Cordy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>John</given-names>
            <surname>Mylopoulos</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>GaiusT Supporting The Extraction Of Rights And Obligations For Regulatory Compliance</article-title>
          .
          <source>Requir. Eng</source>
          .
          <volume>20</volume>
          ,
          <issue>1</issue>
          (
          <year>2015</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          . https://doi.org/10.1007/s00766-013-0181-8
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Zeni</surname>
          </string-name>
          , Luisa Mich, John Mylopoulos, and
          <string-name>
            <given-names>James R.</given-names>
            <surname>Cordy</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Applying GaiusT For Extracting Requirements From Legal Documents</article-title>
          .
          <source>In Sixth International Workshop on Requirements Engineering and Law, RELAW 2013</source>
          , 16 July,
          <year>2013</year>
          , Rio de Janeiro, Brasil.
          <fpage>65</fpage>
          -
          <lpage>68</lpage>
          . https://doi.org/10.1109/RELAW.2013.6671349
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>