<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Designing a Controlled Natural Language for the Representation of Legal Norms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefan Hoefler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexandra Bunzli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computational Linguistics, University of Zurich</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Controlled Legal German (CLG) is a controlled natural language being developed for the representation of legal norms contained in Swiss statutes and regulations. This paper discusses the main design requirements CLG faces and the strategies it applies to meet them. CLG aims at providing representations of legal norms that are formal yet can be easily understood and verified by legal experts. It must combine an unambiguous semantics based on first-order logic (FOL) and deontic concepts with close syntactic proximity to conventional legal language.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The development of knowledge-based legal information systems has been at the
forefront of research in artificial intelligence and law for the last couple of decades
[
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Knowledge-based legal information systems could support legal experts
in constructing and assessing legal arguments or in testing the drafts of new
statutes and regulations for consistency and coherence, and they could provide
intelligent means for specialised legal information retrieval. One of the main
obstacles to the development of knowledge-based legal information systems is
the fact that a large part of legal knowledge is captured in natural language
texts. For knowledge-based information systems to be employed, these texts
must be translated, by hand, into some formal representation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The problem
with this requirement is that knowledge engineers do not have sufficient legal
expertise to assess the accuracy of these formalisations, while legal experts are
usually not familiar with formal representations. Controlled natural language,
i.e. the use of a well-defined subset of natural language that has been assigned an
unambiguous formal semantics, has been proposed as a means to bridge this gap
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Such an approach seems to gain further support from the recent application
of controlled natural language to similar domains: contracts [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], business rules
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], clinical guidelines [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>We are in the process of developing Controlled Legal German (CLG), a
logic-based controlled natural language designed for the representation of norms
codified in Swiss statutes and regulations. CLG aims at providing representations
that are formal and can be easily understood and verified by legal experts. In
this paper, we will first give a brief overview of the state of development of the
project (section 2) and then focus on the specific design requirements that arise
from our target application and the strategies we pursue to meet them in CLG
(section 3). The paper concludes with a brief discussion of some problems and
limitations and an outlook on future research (section 4).</p>
    </sec>
    <sec id="sec-2">
      <title>Controlled Legal German</title>
      <p>The project is currently in its first phase, in which we specify the syntax and
semantics of the controlled language: we define what constructions CLG
sentences can consist of and what formal logical representations they map onto. In
a second phase, we plan to devise the computational tools needed to automate
the mappings involved: a parser that translates CLG statements into the formal
logical representations specified in the definition of the language and a converter
that makes the interpretations assumed in CLG explicit by generating the corresponding
paraphrases (cf. section 3).</p>
      <p>
        CLG restricts the syntax and semantics of Swiss legal language to prevent
instances of ambiguity arising from constructions that either have more than
one syntactic analysis (syntactic ambiguity) or whose syntactic analysis can be
mapped onto more than one non-equivalent logical structure (semantic
ambiguity). Construction rules prohibit the use of certain ambiguous constructions;
interpretation rules assign a default interpretation to others and suggest
paraphrases for excluded readings. As in Attempto Controlled English [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], lexical
ambiguity is only controlled in function words; the definition of the meaning of
content words is left to the user and to separate ontologies. CLG maps content
words onto atomic logical predicates; the often intended vagueness of the legal
concepts they stand for is thus fully preserved.
      </p>
      <p>CLG's unambiguous formal semantics is based on first-order predicate logic
augmented with defeasible rules (A ⇝ B) and the deontic concepts of obligation
(O), permission (P) and prohibition (¬P). We have chosen a formal
underpinning that is expressive and “deep” enough to capture the essential content of
individual norms (who must or may do what, how and under what
circumstances?) and yet generic enough to be easily converted into other formats of
formal representation.</p>
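      <p>As a rough illustration of such a formal underpinning, the following Python sketch models formulas as a small syntax tree with atomic predicates, deontic operators and defeasible rules. It is not part of CLG: the class names (Pred, Not, And, Deontic, Defeasible), the render function, the ASCII notation (not, and, ~> for the defeasible arrow) and the predicate name used in the usage note are assumptions made purely for illustration.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Pred:
    """Atomic predicate such as enacts(e, x, y); content words map onto these."""
    name: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Deontic:
    """Deontic operator: 'O' (obligation) or 'P' (permission)."""
    op: str
    body: "Formula"


@dataclass(frozen=True)
class Defeasible:
    """Defeasible rule: the antecedent normally, but not strictly, yields the consequent."""
    antecedent: "Formula"
    consequent: "Formula"


Formula = Union[Pred, Not, And, Deontic, Defeasible]


def prohibition(body: Formula) -> Formula:
    """Prohibition is expressed as negated permission."""
    return Not(Deontic("P", body))


def render(f: Formula) -> str:
    """Render a formula in a plain ASCII notation (hypothetical, for display only)."""
    if isinstance(f, Pred):
        return f"{f.name}({', '.join(f.args)})"
    if isinstance(f, Not):
        return f"not {render(f.body)}"
    if isinstance(f, And):
        return f"({render(f.left)} and {render(f.right)})"
    if isinstance(f, Deontic):
        return f"{f.op}({render(f.body)})"
    if isinstance(f, Defeasible):
        return f"({render(f.antecedent)} ~> {render(f.consequent)})"
    raise TypeError(f"unknown formula node: {f!r}")
```
      </preformat>
      <p>With these definitions, render(prohibition(Pred("parks", ("x",)))) yields the string not P(parks(x)). Keeping the tree this generic is one way to leave open a later conversion into other formats of formal representation.</p>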
      <p>The current version of the language, CLG 1.0, provides the basic syntactic
and semantic inventory to express simple norms (obligations, permissions,
prohibitions; including norms stating duties and responsibilities). It comprises roughly
two dozen construction and interpretation rules that deal with phenomena such
as attachment ambiguities (PPs, relative clauses), plural ambiguities
(distributive/collective readings), scope ambiguities, lexical ambiguities (function words),
referential ambiguities (pronouns, relational nouns) and functional ambiguities
due to the relatively free German word order. We will exemplify some of these
rules in the next section. The language does not yet include elements of temporal
and intensional logic; consequently, it does not yet permit the use of tenses other
than present tense and subordinate clauses other than conditional and relative
clauses. We plan to add these concepts in CLG versions 2.x and 3.x respectively.</p>
    </sec>
    <sec id="sec-3">
      <title>Design requirements</title>
      <p>The design of CLG has been guided by the two operations it must support: (a)
the translation, by hand, of legislative texts into CLG, carried out by knowledge
engineers; (b) the verification of the CLG representation by legal experts. The
translation of legislative texts into CLG becomes easier the more closely CLG resembles
the language used therein. Moreover, proximity to conventional legal language
will make lawyers feel more at ease with CLG transcriptions of statutes and
regulations and thus facilitate verification by domain experts. On the other hand,
the correctness of CLG transcriptions can only be verified by domain experts if
default interpretations are made explicit.</p>
      <p>
        CLG thus needs to accommodate two somewhat conflicting requirements: (a)
proximity to conventional legal language and (b) maximal explicitness. It does
so by two means. On the one hand, CLG exploits the conventions and frequency
distributions of ordinary legal language. Some constructions are ambiguous in
full natural language but not in the language of Swiss statutes and regulations,
and for some ambiguous constructions, one interpretation is used more frequently
in legislative texts than the other. Wherever possible, such conventions and
frequency distributions are reflected in CLG's construction and interpretation rules
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. On the other hand, CLG contains syntactic sugar: for most meanings,
multiple constructions are on offer to the user, some of which resemble conventional
legal language more closely than others, and some of which are more explicit than
others. Users can therefore first translate a norm into a CLG text
that closely resembles the original. This relatively conventional representation
can then be mapped deterministically onto a semantically identical but more
explicit paraphrase, which can be used to verify the intended interpretation.
      </p>
      <p>As an example, suppose a knowledge engineer needs to formalise the following
norm:</p>
      <p>(1) Die Bundesversammlung erlässt rechtsetzende Bestimmungen in der Form
des Bundesgesetzes [...]. (Art. 163 Abs. 1 Swiss Federal Constitution)
‘The Federal Assembly enacts legislative provisions in the form of the federal
statute.’</p>
      <p>It is not entirely clear whether this is (a) a norm about the responsibilities of the
Federal Assembly or (b) a norm about the enactment of legislative provisions by
the Federal Assembly. In CLG, the two interpretations can be expressed as shown
in (2) and (3) respectively. For each reading, we give a CLG representation that
is close to conventional legal language (C), an explicit paraphrase equivalent to
it (E), and the underlying formal semantics<fn id="fn1"><label>1</label><p>For the convenience of the reader, predicate names have been rendered in English.</p></fn> (F).</p>
      <p>(2) C: Die Bundesversammlung erlässt rechtsetzende Bestimmungen in der
Form des Bundesgesetzes.<fn id="fn2"><label>2</label><p>Note that this CLG representation is identical to the original text; the
reading it stands for is thus particularly easy to extract.</p></fn>
‘The Federal Assembly enacts legislative provisions in the form of the federal
statute.’</p>
      <p>E: Es ist zwingend, dass die Bundesversammlung eine oder mehrere
rechtsetzende Bestimmungen [in der Form des Bundesgesetzes erlässt].
‘It is obligatory that the Federal Assembly [enacts in the form of the federal
statute] one or several legislative provisions.’</p>
      <p>F: <disp-formula><tex-math>\exists! x : \mathit{federal\_assembly}(x) \land O\, \exists e\, y : \mathit{legislative\_provision}(y) \land \mathit{enacts}(e, x, y) \land \mathit{in\_form\_of\_federal\_statute}(e)</tex-math></disp-formula><sup>3</sup></p>
      <p>(3) C: Der Erlass von rechtsetzenden Bestimmungen durch die
Bundesversammlung erfolgt in der Form des Bundesgesetzes.
‘The enactment of legislative provisions by the Federal Assembly is effected
in the form of the federal statute.’</p>
      <p>E: Es ist zwingend, dass jeder Erlass einer rechtsetzenden Bestimmung
durch die Bundesversammlung in der Form des Bundesgesetzes erfolgt.
‘It is obligatory that every enactment of a legislative provision by the Federal
Assembly is effected in the form of the federal statute.’</p>
      <p>F: <disp-formula><tex-math>\exists! x : \mathit{federal\_assembly}(x) \land O\, \forall e\, y : \mathit{enacts}(e, x, y) \land \mathit{legislative\_provision}(y) \rightarrow \mathit{in\_form\_of\_federal\_statute}(e)</tex-math></disp-formula></p>
      <p>The paraphrases make explicit the following default interpretations that underlie the
conventional CLG representations: (a) norms without a modal verb are
considered obligations (made explicit by prefixing the sentence with the phrase es ist
zwingend, dass), (b) PPs attach to the verb (made explicit by inserting brackets), (c) indefinite NPs
in non-vorfeld position (here rechtsetzende Bestimmungen) are interpreted as existentially
quantified (made explicit by using determiners such as eine oder mehrere), and (d)
nominalised verbs in vorfeld position (here der Erlass) are interpreted as universally
quantified (made explicit by adding the determiner jeder).</p>
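      <p>The way default interpretations (a) to (d) determine the explicit paraphrase can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CLG does not yet have a parser or converter, so the Analysis record, its field names, the paraphrase_steps function and the hand-assigned feature values for (2C) and (3C) are all hypothetical.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Analysis:
    """Hand-filled features of a CLG sentence (hypothetical; no parser exists yet)."""
    has_modal_verb: bool
    has_pp_to_attach: bool
    indefinite_np_outside_vorfeld: bool
    nominalised_verb_in_vorfeld: bool


def paraphrase_steps(a: Analysis) -> List[str]:
    """List the markers that turn a conventional form into an explicit one."""
    steps: List[str] = []
    if not a.has_modal_verb:
        # (a) norms without a modal verb are read as obligations
        steps.append("prefix 'Es ist zwingend, dass'")
    if a.has_pp_to_attach:
        # (b) PPs attach to the verb
        steps.append("bracket the PP together with the verb")
    if a.indefinite_np_outside_vorfeld:
        # (c) indefinite NPs outside the vorfeld are existentially quantified
        steps.append("add determiner 'eine oder mehrere'")
    if a.nominalised_verb_in_vorfeld:
        # (d) nominalised verbs in the vorfeld are universally quantified
        steps.append("add determiner 'jeder'")
    return steps


# Feature values assigned by hand for examples (2C) and (3C); they are assumptions.
reading_2c = Analysis(
    has_modal_verb=False,
    has_pp_to_attach=True,
    indefinite_np_outside_vorfeld=True,
    nominalised_verb_in_vorfeld=False,
)
reading_3c = Analysis(
    has_modal_verb=False,
    has_pp_to_attach=False,
    indefinite_np_outside_vorfeld=False,
    nominalised_verb_in_vorfeld=True,
)
```
      </preformat>
      <p>Because every step depends only on features of the conventional sentence, the same input always yields the same list of markers, which is the sense in which the mapping from (C) to (E) is deterministic.</p>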
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>
        The fact that legislative language is highly conventionalised facilitates, to some
degree, the task of designing a controlled natural language that resembles it [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]:
ambiguous constructions often already have a default interpretation in full legislative
language (due to the pragmatics of the domain), and these interpretations can be
turned into interpretation rules in the controlled natural language. What makes
our work more difficult, however, is the lack of thorough linguistic studies of how
specific potentially ambiguous constructions (e.g. indefinite NPs, plurals, etc.)
are used in legislative language.
      </p>
      <p>The provision of explicit paraphrases often poses further problems: it is
simply not possible to find explicit paraphrases for all constructions to which CLG
applies an interpretation rule. We have tried to partially overcome this problem
by admitting non-linguistic elements in these paraphrases. One example is the
brackets we use in (2E) to indicate the attachment of constituents.
<fn id="fn3"><label>3</label><p>Adverbial phrases are not yet analysed in CLG 1.0 but treated as atomic conditions.</p></fn></p>
      <p>The foremost limitation of our project lies in the fact that the field of artificial
intelligence and law has not yet developed a logical formalism capable
of adequately representing the full content of statutory law. For the moment, we
have therefore chosen a formal underpinning for CLG that is expressive and
“deep” enough to capture what we deem the essential content of individual
norms (who must or may do what, how and under what circumstances?) and yet
generic enough to be easily converted into other formats of formal representation.
We believe that such a generic representation offers the best chance that a future
integration into artificial intelligence and law systems that model legal reasoning
will be possible. Such systems will, of course, also have to take into account
information that is not contained in the actual norms that we want to represent
in CLG. Examples are the precedence of more specific norms over more general
norms or the precedence of constitutional provisions over provisions stated in
statutes or regulations.</p>
      <p>Finally, the notorious underspecificity of normative sentences continues to
pose a problem for their translation into a controlled natural language: many
relations within sentences and between sentences (bridging references, ellipses,
rule-exception relations) are not made explicit in normative texts but are left
to be inferred from the context. In a controlled natural language, such relations
have to be stated explicitly. Providing formalisms for doing so will be one of the
largest tasks to be tackled by our future research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Rissland</surname>
            ,
            <given-names>E.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashley</surname>
            ,
            <given-names>K.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loui</surname>
            ,
            <given-names>R.P.</given-names>
          </string-name>
          :
          <article-title>AI and law: A fruitful synergy</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>150</volume>
          (
          <issue>1–2</issue>
          ) (
          <year>2003</year>
          )
          <fpage>1</fpage>
          –
          <lpage>15</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Lodder</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oskamp</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <source>Information Technology &amp; Lawyers: Advanced technology in the legal domain, from challenges to daily routine</source>
          . Springer-Verlag New York, Inc., Secaucus, NJ, USA (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>McCarty</surname>
            ,
            <given-names>L.T.</given-names>
          </string-name>
          :
          <article-title>Deep semantic interpretation of legal texts</article-title>
          .
          <source>In: Proceedings of the 11th Int. Conference on AI and Law</source>
          , New York, ACM Press (
          <year>2007</year>
          )
          <fpage>217</fpage>
          –
          <lpage>224</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hoey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walter</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Natural language interfaces</article-title>
          . In Walter, C., ed.:
          <source>Computer Power and Legal Language</source>
          , New York, Quorum (
          <year>1988</year>
          )
          <fpage>135</fpage>
          –
          <lpage>142</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pace</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A controlled language for the specification of contracts</article-title>
          . In Fuchs, N., ed.:
          <source>Controlled Natural Language</source>
          , Berlin, Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Spreeuwenberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anderson Healy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>SBVR's approach to controlled natural language</article-title>
          . In Fuchs, N., ed.:
          <source>Controlled Natural Language</source>
          , Berlin, Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Shiffman</surname>
            ,
            <given-names>R.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krauthammer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuchs</surname>
            ,
            <given-names>N.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaljurand</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Writing clinical practice guidelines in controlled natural language</article-title>
          . In Fuchs, N.E., ed.:
          <source>Controlled Natural Language</source>
          , Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fuchs</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaljurand</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Attempto Controlled English for knowledge representation</article-title>
          . In Baroglio, C., Bonatti, P., Maluszynski, J., Marchiori, M., Polleres, A., Schaffert, S., eds.:
          <source>Reasoning Web</source>
          , Berlin, Springer (
          <year>2008</year>
          )
          <fpage>104</fpage>
          –
          <lpage>124</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Hoe er, S., Bunzli, A.:
          <article-title>Controlling the language of statutes and regulations for semantic processing</article-title>
          .
          <source>In: Proceedings of the LREC 2010 Workshop on Semantic Processing of Legal Texts (SPLeT 2010)</source>
          , Valletta, Malta (
          <year>2010</year>
          )
          <fpage>8</fpage>
          –
          <lpage>15</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>