=Paper= {{Paper |id=Vol-2888/paper3 |storemode=property |title=Explainable Rule Extraction via Semantic Graphs |pdfUrl=https://ceur-ws.org/Vol-2888/paper3.pdf |volume=Vol-2888 |authors=Gabor Recski,Björn Lellmann,Adam Kovacs,Allan Hanbury |dblpUrl=https://dblp.org/rec/conf/icail/RecskiLKH21 }} ==Explainable Rule Extraction via Semantic Graphs== https://ceur-ws.org/Vol-2888/paper3.pdf
Explainable Rule Extraction via Semantic Graphs
Gabor Recski¹, Björn Lellmann²,³, Adam Kovacs¹,⁴ and Allan Hanbury¹
¹ TU Wien, Vienna, Austria
² SBA Research, Vienna, Austria
³ Federal Ministry for Digital and Economic Affairs, Vienna, Austria (since May 3, 2021)
⁴ Budapest University of Technology and Economics, Budapest, Hungary


Abstract
We present an end-to-end system for extracting deontic logic formulae from legal text using a generic semantic parsing module and task-specific graph grammars, and for performing automated reasoning on the extracted formulae. The pipeline enables automated compliance checking and is applied to text documents of the zoning map of the city of Vienna. All components are released as open-source software, and the full pipeline is showcased in an online demo.

Keywords
semantic parsing, information extraction, automated reasoning, deontic logic



Proceedings of the Fifth Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL 2021), June 25, 2021, São Paulo, Brazil.
Email: gabor.recski@tuwien.ac.at (G. Recski); lellmann@logic.at (B. Lellmann); adam.kovacs@tuwien.ac.at (A. Kovacs); allan.hanbury@tuwien.ac.at (A. Hanbury)
ORCID: 0000-0001-5551-3100 (G. Recski); 0000-0002-5335-1838 (B. Lellmann); 0000-0001-6132-7144 (A. Kovacs); 0000-0002-7149-5843 (A. Hanbury)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

We present an end-to-end system for extracting deontic logic formulae from legal text using a generic semantic parsing module and task-specific graph grammars, and for performing automated reasoning on the extracted formulae. An overview of the pipeline is shown in Fig. 1. Plain text regulations are processed by a pipeline of domain-agnostic language processing tools, including a system for building syntax-independent concept graphs that represent the meaning of each sentence. These graphs serve as the input for a task-specific rule extraction module that maps them to deontic logic formulae, which in turn are used in an automated reasoning system. The proposed pipeline is applied to text documents of the zoning map of the city of Vienna¹, an exciting corpus of legal regulations whose highly structured nature renders it very well suited for formal approaches. In the absence of a large-scale annotated corpus we evaluate our approach on a toy dataset of manually analyzed sentences that were selected to cover the most frequent attributes in the full dataset. Possible applications include automated compliance checking and question answering. The semantic parser and rule extractor components are both entirely rule-based, making our system an example of true explainable AI (XAI). Unlike in deep learning-based information extraction systems, extracted rules can be directly traced back to text patterns, making it straightforward to provide natural language explanations for decisions made based on them. This explainable nature also enables human-in-the-loop operation and provides safeguards against biased decision-making. Our main contributions are:

     • Specification of a formal representation of deontic statements, including those of the construction regulation domain, for automated rule extraction and reasoning

     • A preprocessed, structured corpus of sentences extracted from the zoning plan of the City of Vienna, a small subset of which is annotated manually with formal rule representations

     • A grammar-based system for explainable rule extraction from semantic graphs, evaluated on the annotated corpus

     • The adaptation and extension of a general theorem prover to the reasoning domain, including natural language output

     • System architecture and working prototype for an end-to-end system for rule extraction and automated reasoning from raw text

¹ See https://www.wien.gv.at/flaechenwidmung/public/ for the map and https://www.data.gv.at/katalog/dataset/flachenwidmungs-und-bebauungsplan-plandokumente-wien for how to obtain the text documents (in German).

The paper is structured as follows. In Sec. 2 we review recent work on semantic parsing, automatic rule extraction, and automated reasoning for legal-tech applications, and present the dependencies of our pipeline: a task-independent semantic parser and a deontic logic prover. Sec. 3 presents the rule extraction method, and Sec. 4 describes the architecture of the full pipeline. Sec. 5



Gabor Recski et al. CEUR Workshop Proceedings                                                                                                    1–13



PDF documents → Text segmentation → Sentences → Semantic parsing → Concept graphs → Rule extraction → Deontic rules → Reasoning


Figure 1: Overview of our pipeline. The solid rectangle indicates the newly contributed component and format, dashed
rectangles mark existing tools that we modify or extend.



provides preliminary evaluation of our rule extraction method, and Sec. 6 discusses next steps and draws some conclusions. All components of our system are available as open-source software². The example application is showcased in an online demo³.

² https://github.com/recski/brise-plandok
³ https://ir-group.ec.tuwien.ac.at/brise-extract

2. Related work

The related work for the two main components of the system is presented: (i) semantic parsing for extracting logic expressions directly from legal text, and (ii) deontic logic for performing automated reasoning on the extracted formulae.

2.1. Semantic parsing

Semantic parsing is the task of automatically mapping natural language text to a formal representation of its meaning. Most contemporary architectures for solving information extraction tasks do not perform semantic parsing, instead relying on models which directly encode the correspondence between natural language text and some set of task-specific structures such as labels, sequences, attribute-value structures, etc. These models are primarily built using machine learning methods, whose performance is dependent on the quality and quantity of available training data and whose decisions are difficult to interpret and prone to bias. Rule-based models, on the other hand, require considerable expert effort to build and maintain and can still be difficult to adapt to changes in the task definition. The architecture we propose tackles the information extraction task in two steps: the mapping of natural language text to task- and domain-independent meaning representations (semantic parsing) followed by a task-specific information extraction step that operates on these representations. In this section we give a brief overview of common approaches to semantic parsing and of the representation framework used by our pipeline.

Unlike other forms of automated linguistic annotation such as part-of-speech tagging or syntactic parsing, semantic parsing is not in any way standardized in the natural language processing (NLP) community. Even those few frameworks that have recently attracted growing interest in the task are rarely used as intermediate representations in NLP pipelines. Abstract Meaning Representations (AMRs) [1] represent sentence meaning as directed graphs of words, but do not provide a model of word meaning and are highly English-specific, not intended as a language-independent framework of meaning representation. The task of parsing raw text to AMR graphs has recently attracted growing interest and is usually performed using deep neural networks [2, 3] trained on annotated corpora, also called sembanks. Universal Conceptual Cognitive Annotation (UCCA) [4] takes a language-agnostic approach to meaning representation, modeling sentence meaning with directed acyclic graphs (DAGs) representing scenes evoked by predicates. Top UCCA parsers also rely on manually annotated corpora and neural networks [5, 6, 7, 8]. Since these frameworks do not provide a generic parsing algorithm, building representations for a new language and/or new domain would require the manual compilation of large annotated datasets that could be used to train end-to-end machine learning models. For the pipeline presented in this paper we choose the language-independent 4lang framework [9], for which a robust parsing method [10] with an open-source implementation⁴ [11] is also available.

⁴ https://github.com/adaamko/wikt2def
⁵ https://stanfordnlp.github.io/stanza

The 4lang framework represents the meaning of both words and larger units like phrases and sentences as directed graphs of concepts. The representation is syntax-independent: concepts do not have types such as part-of-speech, or even predicate and argument. A key feature of 4lang graphs that enables uniform treatment of syntactically different constructions is the 0-relation (−0→), a single representation for a range of closely related semantic relationships such as the ISA relationship (e.g. roof −0→ covering), attribution (e.g. sidewalk −0→ paved), and predication (e.g. platform −0→ extend). 4lang graphs can be built from raw text automatically using a rule-based system that uses Universal Dependencies (UD) [12] as an intermediate step. UD trees encode grammatical relations (dependencies) between pairs of words in a sentence; an example of such an analysis is shown in Fig. 2, described later. UD parsers are available for dozens of languages; in this pipeline we use the stanza package⁵ [13]. The transformation of UD trees into 4lang graphs is based on a small set of rules described in [10] and implemented by [11] as parsing and decoding of an Interpreted Regular Tree Grammar (IRTG) [14], a formalism that we also use in this work for implementing the rule extraction







mechanism that maps 4lang semantic graphs to trees of attributes (see Sec. 3.2). Most rules map a single UD edge between two content words to 4lang edges connecting the corresponding concepts (e.g., in the example in Fig. 2, the relations amod and nsubj:pass are mapped to a 0-edge and a 2-edge, respectively).

2.2. Automated reasoning and deontic logic

The investigation of automated reasoning methods in the legal domain has a long history, see, e.g., the seminal work [15]. More recently, automated reasoning methods have been considered for legal texts or company regulations [16, 17], and a large number of reasoning frameworks and tools are available, see, e.g., [18, 19] for an overview. Following the approach in [17], here we consider a general and formalism-independent representation of the regulations, which can be translated into different frameworks. Fortunately, the structure of the intended application, the regulations of the zoning map of Vienna, is relatively clear, and mostly does not require advanced features of the formal language such as nested deontic operators in the assumptions [16] or macros [20].

The specific reasoning engine used in this paper, the theorem prover BRISEprover⁶, is an extension and modification of the theorem prover deonticProver2.0⁷ developed in [21] for reasoning with assumptions in dyadic deontic logic. In this logical framework, propositional logic is extended with dyadic deontic operators obl, for and per. Formulae obl(𝐴, 𝐵), for(𝐴, 𝐵) and per(𝐴, 𝐵) are read as “it is obligatory that 𝐴 given 𝐵”, “it is forbidden that 𝐴 given 𝐵” and “it is permitted that 𝐴 given 𝐵”, respectively. As the first extension considered here we extend the language with predicate symbols to capture properties, e.g., “building height at most 9 metres”, in atomic formulae, e.g., buildingHeightMax(9). Note that it would be straightforward to include an additional argument representing the subject, i.e., which building has a height of at most 9 metres. Since in our case this is clear from the context we simplify the representation by assuming the subject is always the same. The prover decides derivability from a set of factual and deontic assumptions, i.e., non-deontic and non-nested deontic formulae respectively. The reasoning engine supports the specificity principle in the form that more specific deontic assumptions override less specific conflicting ones. E.g., the assumption obl(buildingHeightMax(9), facingStreet), stating that buildings facing the street must have a maximal height of 9 metres, overrides the less specific assumption obl(buildingHeightMin(10), ⊤), stating that under the always true condition ⊤ buildings must have a minimal height of 10 metres. Due to this property the reasoning process is non-monotone: while from the single assumption obl(buildingHeightMin(10), ⊤) we can derive the formula

     obl(¬buildingHeightExactly(8), facingStreet),

stating that the height of buildings facing the street must not be exactly 8 metres, we cannot derive the same formula with the additional assumption obl(buildingHeightMax(9), facingStreet) anymore. As a second modification from the original deonticProver2.0, here we consider a different conflict resolution mechanism, resulting in particular in higher efficiency of the implementation. Reasoning in this logic is implemented via backwards proof search in a sequent system with underivability statements, both in the system from [21] and in a slightly modified reasoning engine. See op. cit. for the details of the original system and Sec. 4.4 for the modifications.

⁶ See http://subsell.logic.at/bprover/briseprover/
⁷ See http://subsell.logic.at/bprover/deonticProver/version2.0/

We are thankful to one of the reviewers for bringing additional relevant literature to our attention [22, 23, 24, 25, 26]. Unfortunately, space and time constraints prevented a detailed comparison for the final version.

3. Rule extraction

The pipeline presented here takes as its input raw text documents containing regulations of the zoning map of the City of Vienna, builds representations of their meaning using the 4lang system (see Sec. 2.1), uses the resulting semantic graphs to extract the legal content of regulations and makes them available to the prover (see Sec. 2.2), which verifies whether some statement is derivable given a set of assumptions. The full architecture is described in Sec. 4; we now present the novel rule extraction component and its interfaces to semantic parsing and automated reasoning.

3.1. Representation

Since we do not want to commit to modeling regulations in a particular formalism, and to facilitate the integration of different reasoning engines, we first convert the legal content of the regulations into a generic representation. For this we assume that a deontic regulation, i.e., a regulation stating an obligation, prohibition or permission, consists of the following parts:

     • Modality: This states whether the regulation is an obligation, a prohibition or a permission;

     • Content: The content of the regulation, i.e., what is obligatory / prohibited / permitted;

     • Conditions: The conditions of the regulation stating when the regulation applies;





[Figure 2, Universal Dependencies analysis: tokens with part-of-speech tags]
Flachdächer/NOUN bis/ADP zu/ADP einer/DET Dachneigung/NOUN von/ADP fünf/NUM Grad/NOUN sind/AUX entsprechend/ADP dem/DET Stand/NOUN der/DET technischen/ADJ Wissenschaften/NOUN zu/PART begrünen/VERB
[Dependency labels used in the tree: nsubj:pass, aux:pass, obl, nmod, case, det, nummod, amod, mark. The dependency arcs and the 4lang semantic graph are shown as drawings in the original figure.]

[Figure 2, formal rule representation]
{"modality": "obligation",
 "attributes": [
    {"type": "content",
     "name": "BegruenungDach",
     "value": null},

    {"type": "condition",
     "name": "Dachart",
     "value": "Flachdach"},

    {"type": "condition",
     "name": "DachneigungMax",
     "value": "5Grad"}]}

Figure 2: Universal Dependency analysis, 4lang semantic graph, and formal rule representation for the sentence Flachdächer
bis zu einer Dachneigung von fünf Grad sind entsprechend dem Stand der technischen Wissenschaften zu begrünen. ‘Flat roofs
with a pitch not exceeding 5 degrees must be greened using state of the art technologies.’
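
The formal rule representation shown in Fig. 2 can be loaded and checked with a few lines of Python. The following is only a minimal sketch, not part of the released system: the helper names validate_rule and group_by_type are hypothetical, and the sets of allowed values are an assumption that combines the three modalities and four attribute types named in Sec. 3.1 with the lower-case spelling used in the figure.

```python
import json

# Assumed vocabularies: modalities and attribute types from Sec. 3.1,
# spelled as in the JSON of Fig. 2.
ALLOWED_MODALITIES = {"obligation", "prohibition", "permission"}
ALLOWED_TYPES = {"content", "condition", "conditionException", "contentException"}
REQUIRED_KEYS = {"type", "name", "value"}

# The rule from Fig. 2, parsed from its JSON form.
FIG2_RULE = json.loads("""
{"modality": "obligation",
 "attributes": [
    {"type": "content", "name": "BegruenungDach", "value": null},
    {"type": "condition", "name": "Dachart", "value": "Flachdach"},
    {"type": "condition", "name": "DachneigungMax", "value": "5Grad"}]}
""")


def validate_rule(rule):
    """Return a list of problems; an empty list means the rule is well-formed."""
    problems = []
    if rule.get("modality") not in ALLOWED_MODALITIES:
        problems.append(f"unknown modality: {rule.get('modality')!r}")
    attributes = rule.get("attributes")
    if not isinstance(attributes, list):
        problems.append("'attributes' must be a list")
        return problems
    for i, attr in enumerate(attributes):
        if not isinstance(attr, dict):
            problems.append(f"attribute {i} is not an object")
            continue
        missing = REQUIRED_KEYS - set(attr)
        if missing:
            problems.append(f"attribute {i} lacks keys: {sorted(missing)}")
        if attr.get("type") not in ALLOWED_TYPES:
            problems.append(f"attribute {i} has unknown type: {attr.get('type')!r}")
    return problems


def group_by_type(rule):
    """Group (name, value) pairs by attribute type, keeping their order."""
    grouped = {}
    for attr in rule["attributes"]:
        grouped.setdefault(attr["type"], []).append((attr["name"], attr["value"]))
    return grouped


if __name__ == "__main__":
    print(validate_rule(FIG2_RULE))
    print(group_by_type(FIG2_RULE))
```

Grouping the attributes by type separates what the regulation demands (the content) from when it applies (the conditions), which is the distinction a later translation into a particular deontic formalism would need.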



     • ConditionExceptions: Possible exceptions to the conditions, stating when the regulation does not apply. E.g., in the regulation “Flat roofs should be green roofs, unless they are glass roofs”, “glass roofs” is an exception to the condition.

     • ContentExceptions: Possible exceptions to the content. E.g., in the regulation “Windows are prohibited except for portholes”, “portholes” is an exception to the content.

This level of granularity seems to capture the necessary details of the sentences found in the documents from the city of Vienna zoning map while being flexible enough to permit translation into different frameworks like dyadic deontic logic, defeasible deontic logic [16], argumentation based approaches [27] or input output logic [28]. Concretely, we represent this structure as a JSON object

     {"modality" : Modality, "attributes" : List}

where the key “modality” takes one of the values “obligation”, “prohibition”, “permission”, and where List is an array containing attributes of the following form:

     {"name" : Name, "value" : Value, "type" : Type}.

Here Name is the name of the attribute, Value is its value, and Type is one of “content”, “condition”, “conditionException”, “contentException”. We obtained the attribute names via collaboration with domain experts from the City of Vienna Baupolizei and verified that the structure is appropriate by manually annotating several hundred sentences from the documents of the zoning map.

As an example, the generic representation of the deontic regulation “Flat roofs should be green roofs unless they are glass roofs” is given by:

     { "modality" : "obligation",
       "attributes" : [
         { "name" : "roofType", "value" : "flatRoof",
           "type" : "condition"},
         { "name" : "greenRoof", "value" : null,
           "type" : "content"},
         { "name" : "roofType", "value" : "glassRoof",
           "type" : "conditionException"}]}

Here we modelled both flat roofs and glass roofs as roof types, while modelling the property of being a green roof as an atomic propositional statement because the latter in the documents corresponds to a more complex proposition.

Note that in contrast to, e.g., the approach in [17], at this stage we do not commit to a particular modelling of exceptions by negation-as-failure or negated conditions. This retains the flexibility of the general format necessary for subsequent specification into a large number of different formalisms. There are of course some limitations inherent in this representation; in particular, we do not model negation. See Sec. 6 for a more detailed discussion.

3.2. Extraction

The mapping from semantic graphs to the rules described above is implemented in two steps. An IRTG grammar similar to that of the semantic parsing system (see Sec. 2.1) is used to extract all attributes, values, and expressions of






modality that occur within a single sentence. A simple heuristic then matches these elements with each other in order to create the generic rule representations described in the previous section. The mapping we establish is between patterns in generic semantic graphs and formal rules. This two-step approach is an arbitrary simplification that makes implementation simpler and more flexible. We shall now describe the approach using some examples and point out some of its current limitations. The first component of our rule extraction system is an IRTG grammar mapping 4lang graphs to lists of strings representing attribute names (e.g. DachneigungMax ‘maximal roof pitch’), modalities (e.g. OBL ‘obligation’), as well as numbers and units of measurement that may be interpreted as attribute values (e.g. 5 and m). We shall refer to this grammar as fl_to_attr. The heuristic for matching values and modalities to attribute names, described later in this section, can only disambiguate between multiple solutions if it is informed about the positions of patterns relative to each other. Hence we use a Tree Grammar as the output interpretation, which allows us to represent these strings as leaves of a tree that reflects the order in which they were recognized, corresponding to steps of composing the 4lang graph from subgraphs. An example of this mapping is presented in Fig. 3.

[Figure 3 content: the 4lang graph of the sentence (a drawing in the original) ⇕ an attribute tree with the leaves OBL, BegruenungDach, Dachart, v_Flachdach, DachneigungMax, q_Grad, v_5]

Figure 3: Example mapping implemented by the fl_to_attr grammar for the sentence in Figure 2.

     E -> a_gebaeudehoehe_maximal(E, E) [100]
     [fl] f_src(f_tgt(merge(r_src(?1), merge(r_tgt(?2),
          "(u / Gebaeudehoehe :0 (v / maximal))"))))
     [attr] *(*(?1, ?2), *("OBL", "GebaeudeHoeheMax"))

Figure 4: Example rule of the fl_to_attr IRTG grammar. The first interpretation line is wrapped for readability, the operations are explained in the text.

Each IRTG rule encoding the correspondence between a 4lang subgraph and an attribute or tree of attributes is a mapping between rule applications in an s-graph algebra [29] and a tree algebra. S-graphs are graphs whose nodes may be marked by special labels
called sources and the s-graph algebra’s core operation            tation by heuristically matching modalities as well as po-
merge creates new s-graphs by taking the union of its              tential attribute values to attribute names. We illustrate
arguments but merging nodes that have the same source.             this process using an example with multiple attributes;
We now illustrate this mechanism with a simple exam-               consider the following sentence fragment, a subordinate
ple, for a more detailed introduction to s-graph algebras          clause of a longer regulation: bei einer Straßenbreite ab 10
and their application to semantic parsing the reader is            m entlang der Fluchtlinien Gehsteige mit einer Breite von
referred to [29]. The IRTG rule presented in Fig. 4 de-            mindestens 2,0 m herzustellen sind. ‘in case of a street
fines two binary operations, to be performed in parallel           width of 10 m or more, sidewalks with a width of at least
on corresponding pairs of s-graphs and trees. The op-              2.0 m are to be constructed along the alignment lines.’.
eration of the 4lang interpretation merges two graphs              The pipeline described so far will extract from this sen-
along their root nodes with the two nodes of the edge              tence three attribute names (StrassenbreiteMin ‘min-
Gebäudehöhe −
                0
               → maximal to create a single graph, while           imum street width’, GehsteigbreiteMin ‘minimum
the operation of the attr interpretation merges the two            sidewalk width’, AnFluchtlinie ‘along the alignment
attribute trees with each other and subsequently with a            line’), two numbers (10, 2.0), two occurrences of the unit
tree of two nodes (OBL, GebaeudeHoeheMax). S-graph                 of measurement m, and the modality OBL ‘obligation’ (the
algebras use three types of operations: the merge opera-           latter based on the word form herzustellen composed of
tion merges two graphs on nodes with matching sources              the verb herstellen ‘construct, produce’ and the infinitive
while the rename and forget operations can be used to              marker zu). These eight elements are organized in a tree
change or delete sources. In the example rule in Fig. 4            structure according to the order in which they appeared
the two argument graphs are renamed so that their root             in the IRTG derivation of the corresponding semantic
sources become src and tgt, and these sources need to              graph, shown in Fig. 5. The tree of attributes is stored
be deleted after the merge operation.                              in a custom data structure that in every node stores the
   Once all relevant strings have been extracted from the          length of the shortest path between any pair of attributes
semantic graph, the next step is to build the rule represen-       below that node. This means that by querying the root of
                                                                   the tree we can retrieve for any attribute a list of all other



Gabor Recski et al. CEUR Workshop Proceedings                                                                                             1–13



attributes ranked by their relative distance in the tree. We first match all units of measurement
to the nearest value in the tree, allowing each value to be associated with at most one unit of
measurement. Next, all non-boolean attributes are matched to the nearest value, using a greedy
algorithm: all possible attribute-value pairs are sorted by their relative distance in the tree,
the pair with the shortest path is stored as a match, its members are removed from the lists of
attributes and values that are still to be matched, and this step is repeated until at least one of
the lists becomes empty. For example, in Fig. 5 the attribute StrassenbreiteMin ‘minimum road
width’ is paired with the value 10, since they are the closest of any pairs of attribute and value
in the tree (the attribute AnFluchtLinie is excluded from this process because it is listed as a
boolean attribute). In the second step the attribute GehsteigbreiteMin ‘minimum sidewalk width’ is
matched with the only remaining value, 2.0. Finally, the type of each attribute must be detected,
i.e. it must be determined whether an attribute is a condition of the rule, part of the content, or
an exception to either one of these (contentException, conditionException, see Sec. 3.1). Some
attributes are explicitly listed as always being of type condition, e.g., Planzeichen and Widmung,
which refer to the ID and designation of an area. Next, the extracted modality elements OBL, FOR,
EXC are matched to the nearest of the remaining attributes, which are in turn determined to be part
of the content (in case of FOR and OBL) or a conditionException (in case of EXC). Finally, all
remaining attributes are given the type condition.

Figure 5: Example of attribute matching for the sentence . . . bei einer Straßenbreite ab 10 m
entlang der Fluchtlinien Gehsteige mit einer Breite von mindestens 2,0 m herzustellen sind. ‘in
case of a street width of 10 m or more, sidewalks with a width of at least 2.0 m are to be
constructed along the alignment lines.’ [Tree diagram with the leaves AnFluchtLinie, OBL,
GehsteigbreiteMin, StrassenbreiteMin, m 2.0 and m 10.]

   These simple heuristics, which are already capable of correctly matching attributes to their
values and of distinguishing between the roles each attribute plays in a rule, allow us to keep the
grammar simpler than it would be if it were to directly generate structures like our generic rule
representations. The IRTG rules currently used (and exemplified in Fig. 4) simply represent
one-to-one correspondences between 4lang subgraphs and strings, but the underlying Regular Tree
Grammar only uses a single nonterminal symbol, i.e. rules are not sensitive to which other rules
were used to construct their arguments. It would be quite straightforward to introduce non-terminal
symbols representing attribute names, numbers, measurement units, and modalities, so that the IRTG
itself would enforce the structure that is currently built in a postprocessing step. The tree in
Figure 5 resembles the order in which patterns corresponding to each element were found in the
semantic graph, which in turn correspond to disjoint subgraphs of the semantic representation that
are each connected to the concept herstellen and roughly correspond to fragments of the original
sentence such as Gehsteig mit einer Breite von mindestens 2,0 m herstellen ‘construct sidewalk with
a width of at least 2.0 m’, bei einer Straßenbreite ab 10 m ‘with a road width of at least 10 m’,
entlang der Fluchtlinien ‘along the alignment lines’, etc. Here we limit our grammar to the task of
understanding each of these patterns independently because this proves sufficient for our purposes
of constructing formal rules from sentences of the zoning map of the City of Vienna, which tend to
be in a one-to-one correspondence with rules of the general structure described in Sec. 3.1. I.e.,
we greatly simplify our task by exploiting the fact that authors of this piece of legislation
rarely express a rule in multiple sentences or incorporate several rules in a single sentence. A
notable exception is when some conditions such as the ID or designation of the areas that a rule
refers to are not repeated in every sentence within the same section. These conditions are
propagated by a simple inheritance mechanism that assumes the values of such attributes to hold
within a single section (see Sec. 4 for how section boundaries are detected).

3.3. Specification to dyadic deontic logic

The extracted rules in the generic format are then translated into the language of dyadic deontic
logic. We chose this particular framework because the sentences of the zoning map of Vienna exhibit
a very clear deontic structure, in contrast, e.g., to the largely definitional character of the
British Nationality Act investigated in [15]. The translation is dependent on the modality in the
following way. Given a rule representation

    {"modality" : Modality, "attributes" : List}
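The greedy matching heuristics of Section 3.2 can be illustrated with a minimal sketch. It is not the released implementation: the nested-list tree below is a hypothetical shape chosen for the example sentence (the actual tree comes from the IRTG derivation), distances are recomputed from root paths instead of being cached in every node, and the function names are our own.

```python
from itertools import product

def leaves(tree, path=()):
    """Yield (path, label) for every leaf of a nested-list tree."""
    if isinstance(tree, str):
        yield path, tree
    else:
        for i, child in enumerate(tree):
            yield from leaves(child, path + (i,))

def distance(p, q):
    """Number of edges between two leaves, given their paths from the root."""
    common = 0
    for a, b in zip(p, q):
        if a != b:
            break
        common += 1
    return len(p) + len(q) - 2 * common

def greedy_match(left, right):
    """Pair the closest remaining (left, right) leaves until one side is used up."""
    candidates = sorted(product(left, right),
                        key=lambda pair: distance(pair[0][0], pair[1][0]))
    used, matches = set(), []
    for l, r in candidates:
        if l[0] not in used and r[0] not in used:
            matches.append((l[1], r[1]))
            used.update((l[0], r[0]))
    return matches

def is_number(label):
    try:
        float(label)
        return True
    except ValueError:
        return False

# Hypothetical attribute tree for the sidewalk example (shape is an assumption).
TREE = ["AnFluchtLinie",
        ["OBL", [["GehsteigbreiteMin", ["m", "2.0"]],
                 ["StrassenbreiteMin", ["m", "10"]]]]]
UNITS, MODALITIES, BOOLEAN = {"m"}, {"OBL", "FOR", "EXC"}, {"AnFluchtLinie"}

all_leaves = list(leaves(TREE))
values = [l for l in all_leaves if is_number(l[1])]
units = [l for l in all_leaves if l[1] in UNITS]
attributes = [l for l in all_leaves
              if l not in values and l[1] not in UNITS | MODALITIES | BOOLEAN]

# Step 1: units to nearest values; step 2: non-boolean attributes to nearest values.
unit_of_value = {value: unit for unit, value in greedy_match(units, values)}
value_of_attribute = dict(greedy_match(attributes, values))
print(value_of_attribute)
```

Sorting all candidate pairs once and skipping pairs with an already used member is equivalent to repeatedly removing the closest pair from the two lists. The subsequent typing step (content, condition, exceptions) is omitted here because it depends on the real tree shape.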







let

    Cnd   := cnd_1(cndV_1) ∧ · · · ∧ cnd_n(cndV_n)
    CntEx := ctEx_1(ctExV_1) ∨ · · · ∨ ctEx_m(ctExV_m)
    CndEx := cdEx_1(cdExV_1) ∨ · · · ∨ cdEx_ℓ(cdExV_ℓ)

where the cnd_i are all the attributes occurring with type “condition” in List, and the cndV_i are
their respective values, and similarly for ctEx for type “contentException” and cdEx for type
“conditionException”. For the content, let further

    Cnt     := cnt_1(cntV_1) ∧ · · · ∧ cnt_m(cntV_m)
    Cnt_for := cnt_1(cntV_1) ∨ · · · ∨ cnt_m(cntV_m)

where again the cnt_i are the attributes of type “content” and the cntV_i their respective values.
The translation of a rule representation with modality “obligation” then is:

    obl(Cnt ∨ CntEx, Cnd) ∧ per(¬(Cnt ∨ CntEx), CndEx)

The translation of a rule with modality “prohibition” is:

    for(Cnt_for ∧ ¬CntEx, Cnd) ∧ per(Cnt_for ∧ ¬CntEx, CndEx)

For the modality “permission” the translation is:

    per(Cnt ∨ CntEx, Cnd) ∧ for(Cnt ∨ CntEx, CndEx)

Note that this translation commits to formalising, e.g., condition exceptions to obligations or
prohibitions as additional permissions. Of course this is by no means the only possible
translation: we could have chosen to embed the condition exception explicitly in the condition of
the resulting formula. This choice is due to the fact that it facilitates the derivation of a
general statement like “Flat roofs should be green roofs” from “Flat roofs should be green roofs
unless they are glass roofs” in the particular logic used in the prover. In particular, when
checking whether a flat roof in general should be a green roof we do not need to explicitly state
that none of the condition exceptions are satisfied, in line with the standard approach in
non-monotonic logic and default reasoning [30].

4. Architecture

We describe the system architecture of the full pipeline that takes raw text documents as input,
builds semantic graphs using the system described in Section 2.1, extracts rules using our method
presented in Section 3.2, maps them to deontic logic formulae as described in Section 3.3 and
provides them as input to the prover (see Section 2.2). All components of our system are available
as open-source software⁸ under an MIT license and the end-to-end pipeline is showcased in an online
demo⁹ integrating all of them.

4.1. Preprocessing and segmentation

The input to our pipeline consists of PDF documents downloaded from the public website of the City
of Vienna. Each PDF document contains regulations pertaining to one zoning area (Plangebiet),
indicated by a four-digit ID. We discard the fraction of documents that are scanned images of
printed documents and do not contain machine-readable text data (253/1431 = 17.7%); we could
include these in our experiments by running optical character recognition (OCR). PDF documents are
then converted to plain text using the pdftotext utility, part of the open-source Poppler
library¹⁰. We use the -layout option of the tool to maintain page layout in the output text file;
this greatly simplifies the subsequent extraction of document structure. Next we use a small set of
regular expressions to establish section boundaries and extract section numbers from the text.
Section numbering often makes use of several levels (e.g. 1, 1.1, 1.1.1, etc.), but this is not
consistent across documents; therefore we only consider top-level sections in subsequent steps that
are sensitive to section boundaries. Besides the inheritance mechanism described in Section 3.2,
this decision is crucial for sentence segmentation, the next step in our pipeline, for which we use
a customized version of the German sentence splitting model of the stanza¹¹ [13] library. The
output of the standard model is postprocessed to undo sentence splits that have been made in error
(e.g. those after periods following abbreviations characteristic of legal text) and also those made
after colons (:) that separate a predicate from its object(s), such as in the text Für die mit BB4
bezeichneten Grundflächen wird bestimmt: Die Errichtung von Gebäuden mit einer maximalen
Gebäudehöhe von 8 m ist zulässig. ‘For areas marked BB4 it is determined: construction of buildings
with a maximum building height of 8 m is allowed.’ The custom sentence segmentation step is
followed by stanza’s default German pipeline (de-gsd, stanza model version 1.1.0) for tokenization,
part-of-speech (POS) tagging and universal dependency parsing.

4.2. Semantic parsing

The next step in our pipeline is to construct semantic graphs from each sentence. The rule
extraction algorithm described in Section 3 assumes that all relevant information present in the
input text is available in the semantic graph that is the output of the generic semantic parsing
pipeline described in Section 2.1. To ensure that this is the case some minor modifications of the
semantic parsing algorithm were also necessary. First, we introduced a small set of rules in the
grammar mapping Universal Depen-

      ⁸ https://github.com/recski/brise-plandok
      ⁹ https://ir-group.ec.tuwien.ac.at/brise-extract
      ¹⁰ https://gitlab.freedesktop.org/poppler/poppler
      ¹¹ https://stanfordnlp.github.io/stanza
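The translation scheme of Section 3.3 can be sketched as a small function from a rule representation to a formula string. This is only an illustration under stated assumptions: the field names "name", "value" and "type" of an attribute entry are hypothetical, an empty conjunction is rendered as ⊤ and an empty disjunction as ⊥ (the text does not say how empty lists are treated), and no simplification of the resulting formula is attempted.

```python
def atoms(attributes, attr_type):
    """Render all attributes of one type as atomic propositions name(value)."""
    return [f'{a["name"]}({a["value"]})' for a in attributes if a["type"] == attr_type]

def conj(parts):
    # Assumption: an empty conjunction is written as the constant "true".
    return " ∧ ".join(parts) if parts else "⊤"

def disj(parts):
    # Assumption: an empty disjunction is written as the constant "false".
    return " ∨ ".join(parts) if parts else "⊥"

def translate(rule):
    """Map {"modality": ..., "attributes": [...]} to a dyadic deontic formula."""
    attrs = rule["attributes"]
    cnd = conj(atoms(attrs, "condition"))
    cnt_ex = disj(atoms(attrs, "contentException"))
    cnd_ex = disj(atoms(attrs, "conditionException"))
    content = atoms(attrs, "content")
    modality = rule["modality"]
    if modality == "obligation":
        body = f"({conj(content)}) ∨ ({cnt_ex})"
        return f"obl({body}, {cnd}) ∧ per(¬({body}), {cnd_ex})"
    if modality == "prohibition":
        body = f"({disj(content)}) ∧ ¬({cnt_ex})"
        return f"for({body}, {cnd}) ∧ per({body}, {cnd_ex})"
    if modality == "permission":
        body = f"({conj(content)}) ∨ ({cnt_ex})"
        return f"per({body}, {cnd}) ∧ for({body}, {cnd_ex})"
    raise ValueError(f"unknown modality: {modality!r}")

# Hypothetical rule: in zone 7181 the roof angle must be at most 5 degrees.
RULE = {
    "modality": "obligation",
    "attributes": [
        {"name": "Plangeb", "value": 7181, "type": "condition"},
        {"name": "DachneigungMax", "value": 5, "type": "content"},
    ],
}
print(translate(RULE))
```

Keeping the three modalities as separate branches mirrors the three formulas of Section 3.3 one to one, which makes it easy to compare the output with the definitions.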







dency representations to semantic graphs for common words expressing negation and modality. The
lemmas nicht and kein trigger the addition of the NEG element to the 4lang graph, dürfen ‘may’ and
zulässig ‘permitted’ are mapped to PER, untersagen ‘prohibit’ and unzulässig ‘not permitted’ to
FOR, and müssen to OBL. Additionally, the German construction consisting of the particle zu
followed by the infinitive form of a verb must also trigger the OBL element, since it can express
modality without any additional linguistic elements. This latter rule is implemented by two
mechanisms, one that looks for the lemma zu with the universal part-of-speech tag (UPOS) PART, the
other for the language-specific part-of-speech tag (XPOS) VVIZU marking verbs that contain the
particle as an infix (e.g. herzustellen from herstellen ‘create, produce’). While even the most
rudimentary treatment of the semantics of German modal expressions would go beyond such a simple
categorization (and the scope of this work), in practice this small enhancement of the semantic
representation of the input text was sufficient to allow for the detection of modality by the rule
extraction mechanism. Finally we also added an ad-hoc rule for detecting exceptions: the presence
of the words sofern or soweit, both roughly equivalent to the English conjunction ‘provided’ and
introducing a clause that limits the applicability of a previous statement, triggers the addition
of an element EXC to the semantic graph which is then also available for processing by the rule
extraction mechanism.

4.3. Rule extraction

We now describe the implementation details of the two-step rule extraction method presented in
Section 3.2. The output of the semantic parser, which serves as the input to rule extraction, is a
single directed graph for each input sentence, generated by an Interpreted Regular Tree Grammar
from Universal Dependency structures (see Section 2.1 for details). For recognizing subgraphs and
mapping them to attributes we also use an IRTG over an algebra of s-graphs; this allows us to pipe
the output of the semantic parser directly into our rule extraction grammar. For each 4lang graph
we dynamically generate a unique grammar. The static set of rules encoding the correspondence
between generic semantic structures and task-specific attributes is extended with empty terminal
rules for each concept in the input graph; this ensures that the entire graph can be constructed by
a sequence of operations that is derivable by the underlying RTG and thus the object can be parsed
by the IRTG. The output interpretation of the IRTG is an algebra of trees, whose leaves are the
individual strings that we use to construct rules in a subsequent step. The trees resemble the
order in which these strings (names and values of attributes, modal elements, units of measurement)
have been added to the output in parallel to the construction of the semantic graph; it is this
additional information that allows us to implement the matching heuristics described in Section
3.2. For parsing of 4lang graphs and generation of attribute trees with these IRTGs we use the
open-source alto¹² library, which also implements s-graph algebras and tree algebras. The alto
system also supports probabilistic parsing with weighted grammars, and we rely on rule weights to
ensure that rules which map subgraphs to attributes always take precedence over the ‘empty’ rules
that are only added to the grammar to ensure that the full graph is derivable. In those few cases
when more than one such ‘content’ rule matches the same subgraph, precedence is given to rules that
cover larger substructures. The trees output by the IRTG parser serve as the input to the heuristic
construction of rules described in the previous section. Finally, rules are converted from the
generic (JSON) format to the language of dyadic deontic logic, as described in Section 3.3.

4.4. The prover

The final step in our pipeline consists of an exemplary reasoning mechanism to draw inferences from
the extracted rules. This step is based on our adaptation¹³ of the generic theorem prover
deonticProver2.0¹⁴, which implements backwards proof search in a sequent system for a dyadic
deontic logic extended with rules for defeasibly reasoning from deontic assumptions [21]. Apart
from specifying the prover to the language obtained from the examples we needed to further modify
it in two ways. First, in order to be able to handle attributes with numerical arguments, such as
DachneigungMax for the maximal angle of the roof, or with strings as argument, such as Dachart for
the roof type, we extended the prover and the underlying reasoning system to handle atomic
propositions with arguments. In addition, we added ground sequents, i.e., structures which can be
used as leaves in a derivation, corresponding to basic properties of measure-like attributes with
natural numbers as values: where msr is a basic attribute for a measure such as Dachneigung, we
considered a triple consisting of the attributes msrGenau(𝑛), msrMin(𝑛) and msrMax(𝑛), expressing
the facts that msr is exactly 𝑛, at least 𝑛, or at most 𝑛, respectively. The relations between
these three attributes are given by:

      • msrGenau(𝑛) → msrMin(𝑛) ∧ msrMax(𝑛)
      • msrMin(𝑛) → msrMin(𝑚), where 𝑚 ≤ 𝑛
      • msrMax(𝑛) → msrMax(𝑚), where 𝑛 ≤ 𝑚
      • msrMax(𝑛) → ¬msrMin(𝑚), where 𝑛 < 𝑚

The ground sequents added to the prover then absorb basic reasoning on these axioms, so that, e.g.,
the formulae

    ¬(DachneigungGenau(𝑛) ∧ DachneigungGenau(𝑚))

      ¹² https://github.com/coli-saar/alto
      ¹³ https://github.com/blellmann/BRISEprover
      ¹⁴ http://subsell.logic.at/bprover/deonticProver/version2.0/
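The four relations between msrGenau, msrMin and msrMax given in Section 4.4 amount to interval reasoning over natural numbers. The sketch below is a semantic reading of these axioms and not the ground sequents of the prover: the tuple encoding of facts and the function names are assumptions made for illustration.

```python
import math

def bounds(facts):
    """Tightest (lower, upper) bounds implied by facts like ("Min", 3) or ("Genau", 5)."""
    lower, upper = 0, math.inf          # values are natural numbers
    for kind, n in facts:
        if kind in ("Min", "Genau"):    # msrGenau(n) -> msrMin(n)
            lower = max(lower, n)
        if kind in ("Max", "Genau"):    # msrGenau(n) -> msrMax(n)
            upper = min(upper, n)
    return lower, upper

def consistent(facts):
    """False exactly when some msrMax(n) and msrMin(m) with n < m are implied."""
    lower, upper = bounds(facts)
    return lower <= upper

def entails_min(facts, m):
    """msrMin(n) -> msrMin(m) for m <= n."""
    return bounds(facts)[0] >= m

def entails_max(facts, m):
    """msrMax(n) -> msrMax(m) for n <= m."""
    return bounds(facts)[1] <= m

# Two different exact roof angles cannot hold at the same time.
print(consistent([("Genau", 5), ("Genau", 7)]))
```

Under this reading, a maximal angle of 5 degrees is incompatible with any exact angle above 5, which is the kind of consequence the ground sequents are meant to make available to the proof search.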







are derivable for 𝑛 ≠ 𝑚, stating that the exact angle of a roof does not have two different values.

   Second, and more significantly, to be more in line with other approaches in the area of deontic
reasoning such as [31] as well as for efficiency reasons we modified the way the prover handles
specificity reasoning when reasoning from deontic assumptions. To illustrate, assume the deontic
assumption

    obl(DachneigungMax(5) ∧ BegruenungDach, Plangeb(7181))                    (1)

stating that the maximal angle of the roof must be 5 degrees and the roof must be green under the
condition that the building is in zone 7181. This would be partially overruled by the additional
more specific assumption

    obl(¬BegruenungDach, Plangeb(7181) ∧ Planzeichen(BB1))                    (2)

stating that roofs in areas of zone 7181 marked with the label BB1 on the map must not be green
roofs. The latter assumption is considered more specific than (1) because its condition
Plangeb(7181) ∧ Planzeichen(BB1) strictly implies the condition Plangeb(7181) of assumption (1). In
deonticProver2.0 the assumption (1) could still be used to infer obligations from the part of its
content not in conflict with the content of the more specific assumption (2), such as

    obl(¬DachneigungGenau(7), Plangeb(7181) ∧ Planzeichen(BB1))               (3)

stating that in areas of zone 7181 marked with the label BB1 the exact angle of the roof must not
be 7 degrees. In our prover we changed this behaviour so that any assumption which is in conflict
with a more specific applicable one cannot be used to derive any obligations. Thus (disregarding
prohibition and permission operators for the sake of exposition) to check whether (3) is derivable
we now check whether there is an assumption obl(𝐴, 𝐵) such that

     1. 𝐴 → ¬DachneigungGenau(7) is derivable
     2. Plangeb(7181) ∧ Planzeichen(BB1) → 𝐵 is derivable
     3. there is no applicable and more specific assumption conflicting with obl(𝐴, 𝐵), i.e., there
        is no obl(𝐶, 𝐷) such that
            a) Plangeb(7181) ∧ Planzeichen(BB1) → 𝐷 and 𝐷 → 𝐵 are derivable
            b) 𝐶 → ¬𝐴 is derivable

Crucially, the “no-conflict” check in item (3b) above only needs to be performed between two
assumptions and not between the formula to be proved and an assumption. This means that instead of
checking for conflicts many times redundantly in the search for a derivation (as is done in
deonticProver2.0) it suffices to perform this check once in a preprocessing stage, store for every
deontic assumption the list of conflicting ones, and only check that none of the assumptions in
this list is applicable and more specific during the actual computation. In our experiments this
increased efficiency was necessary for reasoning with a non-trivial number of deontic assumptions.
To compare the two reasoning methods the user can switch between the original (“classic”) and
modified (“modern”) versions on the web interface¹⁵ for the prover. For the sake of simplicity the
web interface for the whole pipeline only uses the modified version.

   Originally, for derivable input deonticProver2.0 outputs a PDF file with a derivation in the
calculus. However, since the derivations can become rather large (even breaking the maximal limit
on object size in TeX) and the average user might not be acquainted with the specific formalism
used in the prover, we further extended the output module with an option to print the derivation as
an explanation in pseudo-natural language. Explanations can be unfolded step by step by clicking on
a button labelled “Why?” after the “The statement ... is derivable.” output. In unfolding the
explanation, propositional steps are skipped by default to reveal the crucial deontic statements
and assumed facts used there. These intermediary steps can additionally be unfolded by clicking on
a “Why does it follow from the above?” button. In the demo the user can select the output format.

   We stress again that here our prover serves mainly as an example for a possible reasoning
mechanism and that we do not claim that the underlying logic is necessarily the most appropriate.
For this reason we also defer the theoretical details of the modifications of the underlying
sequent system to a forthcoming companion paper.

5. Evaluation

The rule systems presented in Sec. 3.2 were developed based on a small annotated sample of
sentences from the zoning plan of the City of Vienna. In order to establish a representative
sample, we started by estimating the distribution of attributes in the entire corpus by manually
labeling the sentences of 10 randomly selected documents with the attributes they mention (either
as condition or content). This sample contains 344 mentions of attributes in 193 sentences (as well
as 118 sentences without attribute mentions, mostly from the preambles). The number of unique
attributes in the sample is 84, but 193 of the 344 instances (56%) come from the 16 most frequent
attributes. We then chose 6 sentences from this sample that together contain mentions of 7 of these
16 attributes, including the 3 most frequent ones (GebaeudeHoeheMax, AbschlussDachMax,
GebaeudeHoeheArt) that are alone responsible for 17%

      ¹⁵ http://subsell.logic.at/bprover/briseprover/







of all attribute mentions in the larger sample. We annotated these 6 sentences with the full representation of all
rules stated by them and developed our rule extraction system to achieve perfect performance on this toy corpus.
Both this fully annotated set and the larger sample of 10 documents annotated for attribute mentions only are
released along with the software16. While our method of selecting the sentences for the toy corpus ensures that
the attribute extraction step of our method has high coverage (recall above 51% with a precision above 93% on
the sample of 10 documents and 344 attribute instances), this cannot be considered a quantitative evaluation of
the full rule extraction pipeline. The limited amount of annotated data also does not permit any conclusions
about the effect of errors in syntactic parsing made by the stanza model, but our assumption that this should not
become a bottleneck for such standard text is reinforced by the fact that we did not observe any such errors in our
sample. A larger-scale annotation of attribute mentions is currently in progress.

6. Discussion

In this article we have presented a system for extracting formal rules from legal text using generic semantic
parsing and domain-specific pattern-matching, and converting them to deontic logic for use in an automated
reasoning system. All components of the pipeline, including those contributed in this paper, are made available
as open-source software under the MIT license, for unrestricted use in future applications. Unlike information
extraction systems based on machine learning, our rule extraction model is fully explainable and serves as an
example for a specific application of semantic parsing to domain-specific information extraction. While the
semantic representation and parsing algorithms used in our pipeline are language-agnostic, they may require
adaptation to new languages and domains. Furthermore, for domains and text genres that more closely resemble
everyday language use, deep semantic analysis would require lexical inference, a notoriously difficult task in
computational semantics [32, 33]. In our general rule representation we concentrated on deontic statements of
a reasonably simple form. While this form seems to be well adapted to the regulations provided in the texts for
the zoning maps of Vienna, there are some obvious limitations. First, since we always assume the presence of a
deontic modality (obligation, prohibition or permission), at the moment we cannot treat constitutive norms [34]
such as “The area marked on the map with the label BB1 is designated a residential area”. This issue could
be addressed by adding an additional modality “constitutiveNorm” to the general representation together with
an appropriate translation. Second, our propositional language is currently rather restricted, since we do not
permit, e.g., disjunctions in the conditions or content of an obligation. Again, this could be addressed rather
straightforwardly by extending the format of our representation, possibly along the lines of JsonLogic17. We also
do not consider quantification or nested deontic operators. For the current application these features seemed
not to be necessary. Most of these limitations are in line with other current approaches, e.g., [16, 28].
   The proof-of-concept application presented in this paper can serve as a blueprint for semantics-based solutions
to a wide range of information extraction tasks including variants of entity recognition and relation extraction.
Such systems are generally more flexible, interpretable, and less prone to bias than the large neural network
models used for similar tasks. However, to make such systems a viable alternative for everyday NLP applications,
novel methods must be devised for the (semi-)automatic learning of task-specific rule systems like the one manually
built for this project. Concerning the automated reasoning part, we plan to consider specifications to different
frameworks in the future, including those of argumentation theory [27], I/O logic [28], and defeasible deontic
logic [16], and integrate existing provers for these formalisms such as TOAST18, SPINdle19 or TurnipBox20.
Additionally, we plan to implement alternative translations from the generic representation to the language of
dyadic deontic logic, corresponding to different interpretations of the logical structure of deontic statements.
Along the lines of [35] this could be used to compare such different interpretations. Finally, we would like to
investigate whether the part of our pipeline creating general rule representations could be used in combination
with the NAI suite [17]. Our rule-based approach could be used as a first step to automatically suggest a
formalisation of a given legal text, which then could be converted into the format used in the NAI suite and run
through the quality assurance function provided there. The benefit would be that the legal experts do not need to
actively formulate the formalisation of a legal text, but only to potentially adjust it based on the quality
assurance checks.

Acknowledgments

We are grateful to the three anonymous reviewers for their suggestions and for additional references. Work
supported by BRISE-Vienna (UIA04-081), a European Union Urban Innovative Actions project.

   16 https://github.com/recski/brise-plandok
   17 https://jsonlogic.com
   18 http://toast.arg-tech.org/
   19 http://spindle.data61.csiro.au/spindle/
   20 https://turnipbox.netlify.app







References

 [1] L. Banarescu, C. Bonial, S. Cai, M. Georgescu, K. Griffitt, U. Hermjakob, K. Knight, P. Koehn, M. Palmer, N. Schneider, Abstract Meaning Representation for sembanking, in: Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Association for Computational Linguistics, Sofia, Bulgaria, 2013, pp. 178–186. URL: https://www.aclweb.org/anthology/W13-2322.
 [2] C. Lyu, I. Titov, AMR parsing as graph prediction with latent alignment, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 397–407. URL: https://www.aclweb.org/anthology/P18-1037. doi:10.18653/v1/P18-1037.
 [3] S. Zhang, X. Ma, K. Duh, B. Van Durme, AMR parsing as sequence-to-graph transduction, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, 2019, pp. 80–94. URL: https://www.aclweb.org/anthology/P19-1009. doi:10.18653/v1/P19-1009.
 [4] O. Abend, A. Rappoport, Universal Conceptual Cognitive Annotation (UCCA), in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Sofia, Bulgaria, 2013, pp. 228–238. URL: https://www.aclweb.org/anthology/P13-1023.
 [5] D. Hershcovich, O. Abend, A. Rappoport, A transition-based directed acyclic graph parser for UCCA, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 1127–1138. URL: https://www.aclweb.org/anthology/P17-1104. doi:10.18653/v1/P17-1104.
 [6] D. Hershcovich, O. Abend, A. Rappoport, Multitask parsing across semantic representations, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 373–385. URL: https://www.aclweb.org/anthology/P18-1035. doi:10.18653/v1/P18-1035.
 [7] H. Ozaki, G. Morio, Y. Koreeda, T. Morishita, T. Miyoshi, Hitachi at MRP 2020: Text-to-graph-notation transducer, in: Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing, Association for Computational Linguistics, Online, 2020, pp. 40–52. URL: https://www.aclweb.org/anthology/2020.conll-shared.4. doi:10.18653/v1/2020.conll-shared.4.
 [8] D. Samuel, M. Straka, ÚFAL at MRP 2020: Permutation-invariant semantic parsing in PERIN, in: Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing, Association for Computational Linguistics, Online, 2020, pp. 53–64. URL: https://www.aclweb.org/anthology/2020.conll-shared.5. doi:10.18653/v1/2020.conll-shared.5.
 [9] A. Kornai, The algebra of lexical semantics, in: C. Ebert, G. Jäger, J. Michaelis (Eds.), Proceedings of the 11th Mathematics of Language Workshop, LNAI 6149, Springer, 2010, pp. 174–199. doi:10.5555/1886644.1886658.
[10] G. Recski, Building concept definitions from explanatory dictionaries, International Journal of Lexicography 31 (2018) 274–311. doi:10.1093/ijl/ecx007.
[11] Á. Kovács, K. Gémes, A. Kornai, G. Recski, BMEAUT at SemEval-2020 task 2: Lexical entailment with semantic graphs, in: Proceedings of the Fourteenth Workshop on Semantic Evaluation, International Committee for Computational Linguistics, Barcelona (online), 2020, pp. 135–141. URL: https://www.aclweb.org/anthology/2020.semeval-1.15.
[12] J. Nivre, M. Abrams, Ž. Agić, L. Ahrenberg, L. Antonsen, K. Aplonova, M. J. Aranzabe, G. Arutie, M. Asahara, L. Ateyah, M. Attia, A. Atutxa, L. Augustinus, E. Badmaeva, M. Ballesteros, E. Banerjee, S. Bank, V. Barbu Mititelu, V. Basmov, J. Bauer, S. Bellato, K. Bengoetxea, Y. Berzak, I. A. Bhat, R. A. Bhat, E. Biagetti, E. Bick, R. Blokland, V. Bobicev, C. Börstell, C. Bosco, G. Bouma, S. Bowman, A. Boyd, A. Burchardt, M. Candito, B. Caron, G. Caron, G. Cebiroğlu Eryiğit, F. M. Cecchini, G. G. A. Celano, S. Čéplö, S. Cetin, F. Chalub, J. Choi, Y. Cho, J. Chun, S. Cinková, A. Collomb, Ç. Çöltekin, M. Connor, M. Courtin, E. Davidson, M.-C. de Marneffe, V. de Paiva, A. Diaz de Ilarraza, C. Dickerson, P. Dirix, K. Dobrovoljc, T. Dozat, K. Droganova, P. Dwivedi, M. Eli, A. Elkahky, B. Ephrem, T. Erjavec, A. Etienne, R. Farkas, H. Fernandez Alcalde, J. Foster, C. Freitas, K. Gajdošová, D. Galbraith, M. Garcia, M. Gärdenfors, S. Garza, K. Gerdes, F. Ginter, I. Goenaga, K. Gojenola, M. Gökırmak, Y. Goldberg, X. Gómez Guinovart, B. Gonzáles Saavedra, M. Grioni, N. Grūzītis, B. Guillaume, C. Guillot-Barbance, N. Habash, J. Hajič, J. Hajič jr., L. Hà Mỹ, N.-R. Han, K. Harris, D. Haug, B. Hladká, J. Hlaváčová, F. Hociung, P. Hohle, J. Hwang, R. Ion, E. Irimia, Ọ. Ishola,






     T. Jelínek, A. Johannsen, F. Jørgensen, H. Kaşıkara, S. Kahane, H. Kanayama, J. Kanerva, B. Katz, T. Kayadelen, J. Kenney, V. Kettnerová, J. Kirchner, K. Kopacewicz, N. Kotsyba, S. Krek, S. Kwak, V. Laippala, L. Lambertino, L. Lam, T. Lando, S. D. Larasati, A. Lavrentiev, J. Lee, P. Lê Hồng, A. Lenci, S. Lertpradit, H. Leung, C. Y. Li, J. Li, K. Li, K. Lim, N. Ljubešić, O. Loginova, O. Lyashevskaya, T. Lynn, V. Macketanz, A. Makazhanov, M. Mandl, C. Manning, R. Manurung, C. Mărănduc, D. Mareček, K. Marheinecke, H. Martínez Alonso, A. Martins, J. Mašek, Y. Matsumoto, R. McDonald, G. Mendonça, N. Miekka, M. Misirpashayeva, A. Missilä, C. Mititelu, Y. Miyao, S. Montemagni, A. More, L. Moreno Romero, K. S. Mori, S. Mori, B. Mortensen, B. Moskalevskyi, K. Muischnek, Y. Murawaki, K. Müürisep, P. Nainwani, J. I. Navarro Horñiacek, A. Nedoluzhko, G. Nešpore-Bērzkalne, L. Nguyễn Thị, H. Nguyễn Thị Minh, V. Nikolaev, R. Nitisaroj, H. Nurmi, S. Ojala, A. Olúòkun, M. Omura, P. Osenova, R. Östling, L. Øvrelid, N. Partanen, E. Pascual, M. Passarotti, A. Patejuk, G. Paulino-Passos, S. Peng, C.-A. Perez, G. Perrier, S. Petrov, J. Piitulainen, E. Pitler, B. Plank, T. Poibeau, M. Popel, L. Pretkalniņa, S. Prévost, P. Prokopidis, A. Przepiórkowski, T. Puolakainen, S. Pyysalo, A. Rääbis, A. Rademaker, L. Ramasamy, T. Rama, C. Ramisch, V. Ravishankar, L. Real, S. Reddy, G. Rehm, M. Rießler, L. Rinaldi, L. Rituma, L. Rocha, M. Romanenko, R. Rosa, D. Rovati, V. Roșca, O. Rudina, J. Rueter, S. Sadde, B. Sagot, S. Saleh, T. Samardžić, S. Samson, M. Sanguinetti, B. Saulīte, Y. Sawanakunanon, N. Schneider, S. Schuster, D. Seddah, W. Seeker, M. Seraji, M. Shen, A. Shimada, M. Shohibussirri, D. Sichinava, N. Silveira, M. Simi, R. Simionescu, K. Simkó, M. Šimková, K. Simov, A. Smith, I. Soares-Bastos, C. Spadine, A. Stella, M. Straka, J. Strnadová, A. Suhr, U. Sulubacak, Z. Szántó, D. Taji, Y. Takahashi, T. Tanaka, I. Tellier, T. Trosterud, A. Trukhina, R. Tsarfaty, F. Tyers, S. Uematsu, Z. Urešová, L. Uria, H. Uszkoreit, S. Vajjala, D. van Niekerk, G. van Noord, V. Varga, E. Villemonte de la Clergerie, V. Vincze, L. Wallin, J. X. Wang, J. N. Washington, S. Williams, M. Wirén, T. Woldemariam, T.-s. Wong, C. Yan, M. M. Yavrumyan, Z. Yu, Z. Žabokrtský, A. Zeldes, D. Zeman, M. Zhang, H. Zhu, Universal dependencies 2.3, 2018. URL: http://hdl.handle.net/11234/1-2895, LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University.
[13] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C. D. Manning, Stanza: A python natural language processing toolkit for many human languages, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Online, 2020, pp. 101–108. URL: https://www.aclweb.org/anthology/2020.acl-demos.14. doi:10.18653/v1/2020.acl-demos.14.
[14] A. Koller, Semantic construction with graph grammars, in: Proceedings of the 11th International Conference on Computational Semantics, Association for Computational Linguistics, London, UK, 2015, pp. 228–238. URL: https://www.aclweb.org/anthology/W15-0127.
[15] M. Sergot, F. Sadri, R. Kowalski, F. Kriwaczek, P. Hammond, H. Cory, The British Nationality Act as a logic program, Communications of the ACM 29 (1986).
[16] G. Governatori, Practical normative reasoning with defeasible deontic logic, in: C. d’Amato, M. Theobald (Eds.), Reasoning Web 2018, volume 11078 of LNCS, Springer, 2018, pp. 1–25.
[17] T. Libal, A. Steen, NAI: Towards transparent and usable semi-automated legal analysis, in: IRIS 2020, Editions Weblaw, 2020, pp. 265–272.
[18] S. Batsakis, G. Baryannis, G. Governatori, I. Tachmazidis, G. Antoniou, Legal representation and reasoning in practice: A critical comparison, in: JURIX 2018, IOS Press, 2018, pp. 31–40. URL: https://doi.org/10.3233/978-1-61499-935-5-31.
[19] R. Calegari, G. Contissa, F. Lagioia, A. Omicini, G. Sartor, Defeasible systems in legal reasoning: A comparative assessment, in: JURIX 2019, IOS Press, 2019, pp. 169–174. doi:10.3233/FAIA190320.
[20] T. Libal, A meta-level annotation language for legal texts, in: M. Dastani, H. Dong, L. van der Torre (Eds.), CLAR 2020, volume 12061 of LNCS, Springer, 2020, pp. 131–150. doi:10.1007/978-3-030-44638-3_9.
[21] A. Ciabattoni, B. Lellmann, Sequent rules for reasoning and conflict resolution in conditional norms, in: F. Liu, A. Marra, P. Portner, F. V. D. Putte (Eds.), DEON 2020/2021, College Publications, 2021.
[22] D. Merigoux, N. Chataing, J. Protzenko, Catala: A programming language for the law, CoRR abs/2103.03198 (2021). URL: https://arxiv.org/abs/2103.03198. arXiv:2103.03198.
[23] N. O. Nawari, A generalized adaptive framework (GAF) for automating code compliance checking, Buildings 9 (2019). URL: https://www.mdpi.com/2075-5309/9/4/86. doi:10.3390/buildings9040086.
[24] A. Sleimi, N. Sannier, M. Sabetzadeh, L. C. Briand, M. Ceci, J. Dann, An automated framework for the extraction of semantic legal metadata from legal texts, Empirical Software Engineering 26 (2021) 43. doi:10.1007/s10664-020-09933-5.






[25] J. Morris, Blawx: Rules as code demonstration, MIT Computational Law Report (2020). URL: https://law.mit.edu/pub/blawxrulesascodedemonstration.
[26] J. Zhang, N. M. El-Gohary, Automated information transformation for automated regulatory compliance checking in construction, Journal of Computing in Civil Engineering 29 (2015) B4015001. doi:10.1061/(ASCE)CP.1943-5487.0000427.
[27] S. Modgil, H. Prakken, The ASPIC+ framework for structured argumentation: A tutorial, Argument and Computation 5 (2014) 31–62. URL: http://dx.doi.org/10.1080/19462166.2013.869766.
[28] X. Parent, L. van der Torre, Input/output logic, in: D. Gabbay, J. Horty, X. Parent, R. van der Meyden, L. van der Torre (Eds.), Handbook of Deontic Logic and Normative Systems, College Publications, 2013, pp. 495–544.
[29] A. Koller, M. Kuhlmann, A generalized view on parsing and translation, in: Proceedings of the 12th International Conference on Parsing Technologies, Association for Computational Linguistics, Dublin, Ireland, 2011, pp. 2–13. URL: https://www.aclweb.org/anthology/W11-2902.
[30] R. Reiter, A logic for default reasoning, Artificial Intelligence (1980).
[31] J. F. Horty, Reasons as Defaults, Oxford University Press, 2012.
[32] A. Talman, S. Chatzikyriakidis, Testing the generalization power of neural network models across NLI benchmarks, in: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics, Florence, Italy, 2019, pp. 85–94. URL: https://www.aclweb.org/anthology/W19-4810. doi:10.18653/v1/W19-4810.
[33] M. Schmitt, H. Schütze, Language models for lexical inference in context, in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics, Online, 2021, pp. 1267–1280. URL: https://www.aclweb.org/anthology/2021.eacl-main.108.
[34] G. Boella, L. W. N. van der Torre, Regulative and constitutive norms in normative multiagent systems, in: D. Dubois, C. A. Welty, M. Williams (Eds.), KR2004, AAAI Press, 2004, pp. 255–266. URL: http://www.aaai.org/Library/KR/2004/kr04-028.php.
[35] B. Lellmann, F. Gulisano, A. Ciabattoni, Mīmāṃsā deontic reasoning using specificity: A proof theoretic approach, Artificial Intelligence and Law (published online 2020). doi:10.1007/s10506-020-09278-w.



