=Paper= {{Paper |id=Vol-2888/paper3 |storemode=property |title=Explainable Rule Extraction via Semantic Graphs |pdfUrl=https://ceur-ws.org/Vol-2888/paper3.pdf |volume=Vol-2888 |authors=Gabor Recski,Björn Lellmann,Adam Kovacs,Allan Hanbury |dblpUrl=https://dblp.org/rec/conf/icail/RecskiLKH21 }} ==Explainable Rule Extraction via Semantic Graphs== https://ceur-ws.org/Vol-2888/paper3.pdf

Explainable Rule Extraction via Semantic Graphs
Gabor Recski¹, Björn Lellmann²,³, Adam Kovacs¹,⁴ and Allan Hanbury¹
¹ TU Wien, Vienna, Austria
² SBA Research, Vienna, Austria
³ Federal Ministry for Digital and Economic Affairs, Vienna, Austria (since May 3, 2021)
⁴ Budapest University of Technology and Economics, Budapest, Hungary

Abstract
We present an end-to-end system for extracting deontic logic formulae from legal text using a generic semantic parsing
module and task-specific graph grammars, and for performing automated reasoning on the extracted formulae. The pipeline
enables automated compliance checking and is applied to text documents of the zoning map of the city of Vienna. All
components are released as open-source software, and the full pipeline is showcased in an online demo.

Keywords
semantic parsing, information extraction, automated reasoning, deontic logic

1. Introduction

We present an end-to-end system for extracting deontic logic formulae from legal text using a generic semantic
parsing module and task-specific graph grammars, and for performing automated reasoning on the extracted formulae.
An overview of the pipeline is shown in Fig. 1. Plain text regulations are processed by a pipeline of
domain-agnostic language processing tools, including a system for building syntax-independent concept graphs that
represent the meaning of each sentence. These graphs serve as the input for a task-specific rule extraction module
that maps them to deontic logic formulae, which in turn are used in an automated reasoning system. The proposed
pipeline is applied to text documents of the zoning map of the city of Vienna¹, an exciting corpus of legal
regulations whose highly structured nature renders it very well suited for formal approaches. In absence of a
large-scale annotated corpus we evaluate our approach on a toy dataset of manually analyzed sentences that were
selected to cover the most frequent attributes in the full dataset. Possible applications include automated
compliance checking and question answering. The semantic parser and rule extractor components are both entirely
rule-based, making our system an example of true explainable AI (XAI). Unlike in deep learning-based information
extraction systems, extracted rules can be directly traced back to text patterns, making it straightforward to
provide natural language explanations for decisions made based on them. This explainable nature also enables
human-in-the-loop operation and provides safeguards against biased decision-making. Our main contributions are:

• Specification of a formal representation of deontic statements including those of the construction regulation
  domain for automated rule extraction and reasoning
• A preprocessed, structured corpus of sentences extracted from the zoning plan of the City of Vienna, a small
  subset of which is annotated manually with formal rule representations
• A grammar-based system for explainable rule extraction from semantic graphs, evaluated on the annotated corpus
• The adaption and extension of a general theorem prover to the reasoning domain, including natural language output
• System architecture and working prototype for an end-to-end system for rule extraction and automated reasoning
  from raw text

The paper is structured as follows. In Sec. 2 we review recent work on semantic parsing, automatic rule extraction,
and automated reasoning for legal-tech applications, and present the dependencies of our pipeline: a
task-independent semantic parser and a deontic logic prover. Sec. 3 presents the rule extraction method, Sec. 4
describes the architecture of the full pipeline. Sec. 5

¹ See https://www.wien.gv.at/flaechenwidmung/public/ for the map and
https://www.data.gv.at/katalog/dataset/flachenwidmungs-und-bebauungsplan-plandokumente-wien for how to obtain the
text documents (in German).

Proceedings of the Fifth Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL 2021),
June 25, 2021, São Paulo, Brazil.
Email: gabor.recski@tuwien.ac.at (G. Recski); lellmann@logic.at (B. Lellmann); adam.kovacs@tuwien.ac.at (A. Kovacs);
allan.hanbury@tuwien.ac.at (A. Hanbury)
ORCID: 0000-0001-5551-3100 (G. Recski); 0000-0002-5335-1838 (B. Lellmann); 0000-0001-6132-7144 (A. Kovacs);
0000-0002-7149-5843 (A. Hanbury)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Gabor Recski et al. CEUR Workshop Proceedings 1–13

PDF documents → [Text segmentation] → Sentences → [Semantic parsing] → Concept graphs → [Rule extraction] → Deontic rules → [Reasoning]

Figure 1: Overview of our pipeline. The solid rectangle indicates the newly contributed component and format, dashed
rectangles mark existing tools that we modify or extend.

provides preliminary evaluation of our rule extraction method, Sec. 6 discusses next steps and draws some
conclusions. All components of our system are available as open-source software². The example application is
showcased in an online demo³.

² https://github.com/recski/brise-plandok
³ https://ir-group.ec.tuwien.ac.at/brise-extract

2. Related work

The related work for the two main components of the system is presented: (i) semantic parsing for extracting logic
expressions directly from legal text, and (ii) deontic logic for performing automated reasoning on the extracted
formulae.

2.1. Semantic parsing

Semantic parsing is the task of automatically mapping natural language text to a formal representation of its
meaning. Most contemporary architectures for solving information extraction tasks do not perform semantic parsing,
instead relying on models which directly encode the correspondence between natural language text and some set of
task-specific structures such as labels, sequences, attribute-value structures, etc. These models are primarily
built using machine learning methods, whose performance is dependent on the quality and quantity of available
training data and whose decisions are difficult to interpret and prone to bias. Rule-based models, on the other
hand, require considerable expert effort to build and maintain and can still be difficult to adapt to changes in
the task definition. The architecture we propose tackles the information extraction task in two steps: the mapping
of natural language text to task- and domain-independent meaning representations (semantic parsing) followed by a
task-specific information extraction step that operates on these representations. In this section we give a brief
overview of common approaches to semantic parsing and of the representation framework used by our pipeline.

Unlike other forms of automated linguistic annotation such as part-of-speech tagging or syntactic parsing, semantic
parsing is not in any way standardized in the natural language processing (NLP) community. Even those few
frameworks that have recently attracted growing interest in the task are rarely used as intermediate
representations in NLP pipelines. Abstract Meaning Representations (AMRs) [1] represent sentence meaning as directed
graphs of words, but do not provide a model of word meaning and are highly English-specific, not intended as a
language-independent framework of meaning representation. The task of parsing raw text to AMR graphs has recently
attracted growing interest and is usually performed using deep neural networks [2, 3] trained on annotated corpora,
also called sembanks. Universal Conceptual Cognitive Annotation (UCCA) [4] takes a language-agnostic approach to
meaning representation, modeling sentence meaning with directed acyclic graphs (DAGs) representing scenes evoked by
predicates. Top UCCA parsers also rely on manually annotated corpora and neural networks [5, 6, 7, 8]. Since these
frameworks do not provide a generic parsing algorithm, building representations for a new language and/or new
domain would require the manual compilation of large annotated datasets that could be used to train end-to-end
machine learning models. For the pipeline presented in this paper we choose the language-independent 4lang
framework [9], for which a robust parsing method [10] with an open-source implementation⁴ [11] is also available.

⁴ https://github.com/adaamko/wikt2def
⁵ https://stanfordnlp.github.io/stanza

The 4lang framework represents the meaning of both words and larger units like phrases and sentences as directed
graphs of concepts. The representation is syntax-independent, concepts do not have types such as part-of-speech or
even as predicate and argument. A key feature of 4lang graphs that enables uniform treatment of syntactically
different constructions is the 0-relation ($\xrightarrow{0}$), a single representation for a range of closely
related semantic relationships such as the ISA relationship (e.g. roof $\xrightarrow{0}$ covering), attribution
(e.g. sidewalk $\xrightarrow{0}$ paved), and predication (e.g. platform $\xrightarrow{0}$ extend). 4lang graphs can
be built from raw text automatically using a rule-based system that uses Universal Dependencies (UD) [12] as an
intermediate step. UD trees encode grammatical relations (dependencies) between pairs of words in a sentence; an
example of such an analysis is shown in Fig. 2, described later. UD parsers are available for dozens of languages,
in this pipeline we use the stanza package⁵ [13]. The transformation of UD trees into 4lang graphs is based on a
small set of rules described in [10] and implemented by [11] as parsing and decoding of an Interpreted Regular Tree
Grammar (IRTG) [14], a formalism that we also use in this work for implementing the rule extraction


mechanism that maps 4lang semantic graphs to trees of attributes (see Sec. 3.2). Most rules map a single UD edge
between two content words to 4lang edges connecting the corresponding concepts (e.g., in the example in Fig. 2, the
relations amod and nsubj:pass are mapped to a 0-edge and a 2-edge, respectively).

2.2. Automated reasoning and deontic logic

The investigation of automated reasoning methods in the legal domain has a long history, see, e.g., the seminal
[15]. More recently, automated reasoning methods have been considered for legal texts or company regulations
[16, 17], and a large number of reasoning frameworks and tools are available, see, e.g., [18, 19] for an overview.
Following the approach in [17], here we consider a general and formalism-independent representation of the
regulations, which can be translated into different frameworks. Fortunately, the structure of the intended
application, the regulations of the zoning map of Vienna, is relatively clear, and mostly does not require advanced
features of the formal language such as nested deontic operators in the assumptions [16] or macros [20].

The specific reasoning engine used in this paper, the theorem prover BRISEprover⁶, is an extension and modification
of the theorem prover deonticProver2.0⁷ developed in [21] for reasoning with assumptions in dyadic deontic logic. In
this logical framework, propositional logic is extended with dyadic deontic operators obl, for and per. Formulae
obl(𝐴, 𝐵), for(𝐴, 𝐵) and per(𝐴, 𝐵) are read as “it is obligatory that 𝐴 given 𝐵”, “it is forbidden that 𝐴 given 𝐵”
and “it is permitted that 𝐴 given 𝐵”, respectively. As the first extension considered here we extend the language
with predicate symbols to capture properties, e.g., “building height at most 9 metres”, in atomic formulae, e.g.,
buildingHeightMax(9). Note that it would be straightforward to include an additional argument representing the
subject, i.e., which building has a height of at most 9 metres. Since in our case this is clear from the context we
simplify the representation by assuming the subject is always the same. The prover decides derivability from a set
of factual and deontic assumptions, i.e., non-deontic and non-nested deontic formulae respectively. The reasoning
engine supports the specificity principle in the form that more specific deontic assumptions override less specific
conflicting ones. E.g., the assumption obl(buildingHeightMax(9), facingStreet), stating that buildings facing the
street must have a maximal height of 9 metres, overrides the less specific assumption
obl(buildingHeightMin(10), ⊤), stating that under the always true condition ⊤ buildings must have a minimal height
of 10 metres. Due to this property the reasoning process is non-monotone: while from the single assumption
obl(buildingHeightMin(10), ⊤) we can derive the formula

    obl(¬buildingHeightExactly(8), facingStreet),

stating that the height of buildings facing the street must not be exactly 8 metres, we cannot derive the same
formula with the additional assumption obl(buildingHeightMax(9), facingStreet) anymore. As a second modification
from the original deonticProver2.0, here we consider a different conflict resolution mechanism, resulting in
particular in higher efficiency of the implementation. Reasoning in this logic is implemented via backwards proof
search in a sequent system with underivability statements, both in the system from [21] and in a slightly modified
reasoning engine. See op. cit. for the details of the original system and Sec. 4.4 for the modifications.

We are thankful to one of the reviewers for bringing additional relevant literature to our attention
[22, 23, 24, 25, 26]. Unfortunately, space and time constraints prevented a detailed comparison for the final
version.

⁶ See http://subsell.logic.at/bprover/briseprover/
⁷ See http://subsell.logic.at/bprover/deonticProver/version2.0/

3. Rule extraction

The pipeline presented here takes as its input raw text documents containing regulations of the zoning map of the
City of Vienna, builds representations of their meaning using the 4lang system (see Sec. 2.1), uses the resulting
semantic graphs to extract the legal content of regulations and makes them available to the prover (see Sec. 2.2),
which verifies whether some statement is derivable given a set of assumptions. The full architecture is described
in Sec. 4; we now present the novel rule extraction component and its interfaces to semantic parsing and automated
reasoning.

3.1. Representation

Since we do not want to commit to modeling regulations in a particular formalism, and to facilitate the integration
of different reasoning engines, we first convert the legal content of the regulations into a generic
representation. For this we assume that a deontic regulation, i.e., a regulation stating an obligation, prohibition
or permission, is comprised of the following parts:

• Modality: This states whether the regulation is an obligation, a prohibition or a permission;
• Content: The content of the regulation, i.e., what is obligatory / prohibited / permitted;
• Conditions: The conditions of the regulation stating when the regulation applies;


[Figure 2, top: Universal Dependencies analysis. The dependency arcs are not recoverable from the text; the relation
labels shown are nsubj:pass, aux:pass, obl, case, nmod, det, nummod, amod and mark.]

Flachdächer/NOUN bis/ADP zu/ADP einer/DET Dachneigung/NOUN von/ADP fünf/NUM Grad/NOUN sind/AUX entsprechend/ADP
dem/DET Stand/NOUN der/DET technischen/ADJ Wissenschaften/NOUN zu/PART begrünen/VERB

[Figure 2, middle: 4lang semantic graph (image not reproduced).]

{"modality": "obligation",
 "attributes": [
   {"type": "content",
    "name": "BegruenungDach",
    "value": null},

   {"type": "condition",
    "name": "Dachart",
    "value": "Flachdach"},

   {"type": "condition",
    "name": "DachneigungMax",
    "value": "5Grad"}]}

Figure 2: Universal Dependency analysis, 4lang semantic graph, and formal rule representation for the sentence Flachdächer
bis zu einer Dachneigung von fünf Grad sind entsprechend dem Stand der technischen Wissenschaften zu begrünen. ‘Flat roofs
with a pitch not exceeding 5 degrees must be greened using state of the art technologies.’
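
The formal rule representation at the bottom of Figure 2 is a plain JSON object, so it can be checked and serialized with very little code. The following is only a minimal sketch and not the authors' implementation: the class names `Attribute` and `Rule`, the method `to_json`, and the two sets of allowed values are assumptions made for illustration, with the allowed values taken from the modalities and attribute types named in Sec. 3.1.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional

# Assumed vocabularies (hypothetical constants), based on the values used in the paper's examples.
MODALITIES = {"obligation", "prohibition", "permission"}
ATTRIBUTE_TYPES = {"content", "condition", "contentException", "conditionException"}


@dataclass
class Attribute:
    """One attribute of a rule; `value` stays None for boolean attributes."""
    type: str
    name: str
    value: Optional[str] = None  # None is serialized as JSON null

    def __post_init__(self) -> None:
        if self.type not in ATTRIBUTE_TYPES:
            raise ValueError(f"unknown attribute type: {self.type!r}")


@dataclass
class Rule:
    """Generic rule representation: a modality plus a list of attributes."""
    modality: str
    attributes: List[Attribute]

    def __post_init__(self) -> None:
        if self.modality not in MODALITIES:
            raise ValueError(f"unknown modality: {self.modality!r}")

    def to_json(self) -> str:
        # asdict() converts the nested Attribute objects to plain dicts as well.
        return json.dumps(asdict(self), ensure_ascii=False)


# The rule shown in Figure 2: flat roofs with a pitch of at most 5 degrees must be greened.
figure2_rule = Rule(
    modality="obligation",
    attributes=[
        Attribute("content", "BegruenungDach"),
        Attribute("condition", "Dachart", "Flachdach"),
        Attribute("condition", "DachneigungMax", "5Grad"),
    ],
)

if __name__ == "__main__":
    print(figure2_rule.to_json())
```

Rejecting unknown modalities and attribute types at construction time is a design choice of this sketch; it keeps malformed rules from reaching a later translation step.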

• ConditionExceptions: Possible exceptions to the conditions, stating when the regulation does not apply. E.g., in
  the regulation “Flat roofs should be green roofs, unless they are glass roofs”, the “glass roofs” is an exception
  to the condition.
• ContentExceptions: Possible exceptions to the content. E.g., in the regulation “Windows are prohibited except for
  portholes” the “portholes” are an exception to the content.

This level of granularity seems to capture the necessary details of the sentences found in the documents from the
city of Vienna zoning map while being flexible enough to permit translation into different frameworks like dyadic
deontic logic, defeasible deontic logic [16], argumentation-based approaches [27] or input/output logic [28].
Concretely, we represent this structure as a JSON object

    {"modality" : Modality, "attributes" : List}

where the key “modality” takes one of the values “obligation”, “prohibition”, “permission”, and where List is an
array containing attributes of the following form:

    {"name" : Name, "value" : Value, "type" : Type}.

Here Name is the name of the attribute, Value is its value, and Type is one of “content”, “condition”,
“conditionException”, “contentException”. We obtained the attribute names via collaboration with domain experts from
the City of Vienna Baupolizei and verified that the structure is appropriate by manually annotating several hundred
sentences from the documents of the zoning map.

As an example, the generic representation of the deontic regulation “Flat roofs should be green roofs unless they
are glass roofs” is given by:

    { "modality" : "obligation",
      "attributes" : [
        { "name" : "roofType", "value" : "flatRoof",
          "type" : "condition"},
        { "name" : "greenRoof", "value" : null,
          "type" : "content"},
        { "name" : "roofType", "value" : "glassRoof",
          "type" : "conditionException"}]}

Here we modelled both flat roofs and glass roofs as roof types, while modelling the property of being a green roof
as an atomic propositional statement because the latter in the documents corresponds to a more complex proposition.

Note that in contrast to, e.g., the approach in [17], at this stage we do not commit to a particular modelling of
exceptions by negation-as-failure or negated conditions. This retains the flexibility of the general format
necessary for subsequent specification into a large number of different formalisms. There are of course some
limitations inherent in this representation, in particular we do not model negation. See Sec. 6 for a more detailed
discussion.

3.2. Extraction

The mapping from semantic graphs to the rules described is implemented in two steps. An IRTG grammar similar to
that of the semantic parsing system (see Sec. 2.1) is used to extract all attributes, values, and expressions of


modality that occur within a single sentence. A simple heuristic then matches these elements with each other in
order to create the generic rule representations described in the previous section. The mapping we establish is
between patterns in generic semantic graphs and formal rules. This two-step approach is an arbitrary simplification
that makes implementation simpler and more flexible. We shall now describe the approach using some examples and
point out some of its current limitations.

The first component of our rule extraction system is an IRTG grammar mapping 4lang graphs to lists of strings
representing attribute names (e.g. DachneigungMax ‘maximal roof pitch’), modalities (e.g. OBL ‘obligation’), as
well as numbers and units of measurement that may be interpreted as attribute values (e.g. 5 and m). We shall refer
to this grammar as fl_to_attr. The heuristics for matching values and modalities to attribute names, described
later in this section, can only disambiguate between multiple solutions if they are informed about the positions of
patterns relative to each other. Hence we use a Tree Grammar as the output interpretation, which allows us to
represent these strings as leaves of a tree that reflects the order in which they were recognized, corresponding to
steps of composing the 4lang graph from subgraphs. An example of this mapping is presented in Fig. 3.

Each IRTG rule encoding the correspondence between a 4lang subgraph and an attribute or tree of attributes is a
mapping between rule applications in an s-graph algebra [29] and a tree algebra. S-graphs are graphs whose nodes may
be marked by special labels called sources, and the s-graph algebra’s core operation merge creates new s-graphs by
taking the union of its arguments but merging nodes that have the same source. We now illustrate this mechanism
with a simple example; for a more detailed introduction to s-graph algebras and their application to semantic
parsing the reader is referred to [29]. The IRTG rule presented in Fig. 4 defines two binary operations, to be
performed in parallel on corresponding pairs of s-graphs and trees. The operation of the 4lang interpretation
merges two graphs along their root nodes with the two nodes of the edge Gebäudehöhe $\xrightarrow{0}$ maximal to
create a single graph, while the operation of the attr interpretation merges the two attribute trees with each
other and subsequently with a tree of two nodes (OBL, GebaeudeHoeheMax). S-graph algebras use three types of
operations: the merge operation merges two graphs on nodes with matching sources while the rename and forget
operations can be used to change or delete sources. In the example rule in Fig. 4 the two argument graphs are
renamed so that their root sources become src and tgt, and these sources need to be deleted after the merge
operation.

[Figure 3 image: the 4lang graph of the sentence (top) is mapped (⇕) to a tree of attributes with the nodes OBL,
BegruenungDach, Dachart, v_Flachdach, DachneigungMax, q_Grad and v_5.]

Figure 3: Example mapping implemented by the fl_to_attr grammar for the sentence in Figure 2.

    E -> a_gebaeudehoehe_maximal(E, E) [100]
    [fl] f_src(f_tgt(merge(r_src(?1), merge(r_tgt(?2),
         "(u / Gebaeudehoehe :0 (v / maximal))"))))
    [attr] *(*(?1, ?2), *("OBL", "GebaeudeHoeheMax"))

Figure 4: Example rule of the fl_to_attr IRTG grammar. The first interpretation line is wrapped for readability,
the operations are explained in the text.

Once all relevant strings have been extracted from the semantic graph, the next step is to build the rule
representation by heuristically matching modalities as well as potential attribute values to attribute names. We
illustrate this process using an example with multiple attributes; consider the following sentence fragment, a
subordinate clause of a longer regulation: bei einer Straßenbreite ab 10 m entlang der Fluchtlinien Gehsteige mit
einer Breite von mindestens 2,0 m herzustellen sind. ‘in case of a street width of 10 m or more, sidewalks with a
width of at least 2.0 m are to be constructed along the alignment lines.’ The pipeline described so far will
extract from this sentence three attribute names (StrassenbreiteMin ‘minimum street width’, GehsteigbreiteMin
‘minimum sidewalk width’, AnFluchtlinie ‘along the alignment line’), two numbers (10, 2.0), two occurrences of the
unit of measurement m, and the modality OBL ‘obligation’ (the latter based on the word form herzustellen composed of
the verb herstellen ‘construct, produce’ and the infinitive marker zu). These eight elements are organized in a
tree structure according to the order in which they appeared in the IRTG derivation of the corresponding semantic
graph, shown in Fig. 5. The tree of attributes is stored in a custom data structure that in every node stores the
length of the shortest path between any pair of attributes below that node. This means that by querying the root of
the tree we can retrieve for any attribute a list of all other


attributes ranked by their relative distance in the tree. We first match all units of measurement to the nearest
value in the tree, allowing each value to be associated with at most one unit of measurement. Next, all non-boolean
attributes are matched to the nearest value, using a greedy algorithm: all possible attribute-value pairs are
sorted by their relative distance in the tree, the pair with the shortest path is stored as a match, its members
are removed from the lists of attributes and values that are still to be matched, and this step is repeated until
at least one of the lists becomes empty. For example, in Fig. 5 the attribute StrassenbreiteMin ‘minimum road width’
is paired with the value 10, since they are the closest of any pairs of attribute and value in the tree (the
attribute AnFluchtLinie is excluded from this process because it is listed as a boolean attribute). In the second
step the attribute GehsteigbreiteMin ‘minimum sidewalk width’ is matched with the only remaining value, 2.0.
Finally, the type of each attribute must be detected, i.e. it must be determined whether an attribute is a
condition of the rule, part of the content, or an exception to either one of these (contentException,
conditionException, see Sec. 3.1). Some attributes are explicitly listed as always being of type condition, e.g.,
Planzeichen and Widmung which refer to the ID and designation of an area. Next, the extracted modality elements
OBL, FOR, EXC are matched to the nearest of the remaining attributes, which are in turn determined to be part of
the content (in case of FOR and OBL) or a conditionException (in case of EXC). Finally, all remaining attributes
are given the type condition.

[Figure 5 image: attribute tree with the nodes AnFluchtLinie, OBL, GehsteigbreiteMin, StrassenbreiteMin, m, 2.0, m
and 10.]

Figure 5: Example of attribute matching for the sentence . . . bei einer Straßenbreite ab 10 m entlang der
Fluchtlinien Gehsteige mit einer Breite von mindestens 2,0 m herzustellen sind. ‘in case of a street width of 10 m
or more, sidewalks with a width of at least 2.0 m are to be constructed along the alignment lines.’

These simple heuristics, which are already capable of correctly matching attributes to their values and of
distinguishing between the roles each attribute plays in a rule, allow us to keep the grammar simpler than it would
be if it was to directly generate structures like our generic rule representations. The IRTG rules currently used
(and exemplified in Fig. 4) simply represent one-to-one correspondences between 4lang subgraphs and strings, but the
underlying Regular Tree Grammar only uses a single nonterminal symbol, i.e. rules are not sensitive to which other
rules were used to construct their arguments. It would be quite straightforward to introduce non-terminal symbols
representing attribute names, numbers, measurement units, and modalities, so that the IRTG itself would enforce the
structure that is currently built in a postprocessing step. The tree in Figure 5 reflects the order in which
patterns corresponding to each element were found in the semantic graph, which in turn correspond to disjoint
subgraphs of the semantic representation that are each connected to the concept herstellen and roughly correspond
to fragments of the original sentence such as Gehsteig mit einer Breite von mindestens 2,0 m herstellen ‘construct
sidewalk with a width of at least 2.0 m’, bei einer Straßenbreite ab 10 m ‘with a road width of at least 10 m’,
entlang der Fluchtlinien ‘along the alignment lines’, etc. Here we limit our grammar to the task of understanding
each of these patterns independently because this proves sufficient for our purposes of constructing formal rules
from sentences of the zoning map of the City of Vienna, which tend to be in a one-to-one correspondence with rules
of the general structure described in Sec. 3.1. I.e., we greatly simplify our task by exploiting the fact that
authors of this piece of legislation rarely express a rule in multiple sentences or incorporate several rules in a
single sentence. A notable exception is when some conditions such as the ID or designation of the areas that a rule
refers to are not repeated in every sentence within the same section. These conditions are propagated by a simple
inheritance mechanism that assumes the values of such attributes to hold within a single section (see Sec. 4 for
how section boundaries are detected).

3.3. Specification to dyadic deontic logic

The extracted rules in the generic format are then translated into the language of dyadic deontic logic. We chose
this particular framework because the sentences of the zoning map of Vienna exhibit a very clear deontic structure,
in contrast, e.g., to the largely definitional character of the British Nationality Act investigated in [15]. The
translation is dependent on the modality in the following way. Given a rule representation

    {"modality" : Modality, "attributes" : List}


let 4.1. Preprocessing and segmentation
Cnd := cnd1 (cndV1 ) ∧ · · · ∧ cnd𝑛 (cndV𝑛 ) The input to our pipeline consists of PDF documents
CntEx := ctEx1 (ctExV1 ) ∨ · · · ∨ ctEx𝑚 (ctExV𝑚 ) downloaded from the public website of the City of Vi-
CndEx := cdEx1 (cdExV1 ) ∨ · · · ∨ cdExℓ (cdExVℓ ) enna. Each PDF document contains regulations pertain-
ing to one zoning area (Plangebiet), indicated by a four-
where the cnd𝑖 are all the attributes occurring with type digit ID. We discard the fraction of documents that are
“condition” in List, and the cndV𝑖 are their respective scanned images of printed documents and do not con-
values, and similarly for ctEx for type “contentException” tain machine-readable text data (253/1431 = 17.7%) —
and cdEx for type “conditionException”. For the content, we could include these in our experiments by running
let further
optical character recognition (OCR). PDF documents are then converted to plain text using the pdftotext utility, part of the open-source Poppler library¹⁰. We use the -layout option of the tool to maintain page layout in the output text file; this greatly simplifies the subsequent extraction of document structure. Next we use a small set of regular expressions to establish section boundaries and extract section numbers from the text. Section numbering often makes use of several levels (e.g. 1, 1.1, 1.1.1, etc.), but this is not consistent across documents, therefore we only consider top-level sections in subsequent steps that are sensitive to section boundaries. Besides the inheritance mechanism described in Section 3.2, this decision is crucial for sentence segmentation, the next step in our pipeline, for which we use a customized version of the German sentence splitting model of the stanza¹¹ [13] library. The output of the standard model is postprocessed to undo sentence splits that have been made in error (e.g. those after periods following abbreviations characteristic of legal text) and also those made after colons (:) that separate a predicate from its object(s), such as in the text Für die mit BB4 bezeichneten Grundflächen wird bestimmt: Die Errichtung von Gebäuden mit einer maximalen Gebäudehöhe von 8 m ist zulässig. ‘For areas marked BB4 it is determined: construction of buildings with a maximum building height of 8 m is allowed.’ The custom sentence segmentation step is followed by stanza’s default German pipeline (de-gsd, stanza model version 1.1.0) for tokenization, part-of-speech (POS) tagging and universal dependency parsing.

Cnt := cnt_1(cntV_1) ∧ ··· ∧ cnt_m(cntV_m)
Cntfor := cnt_1(cntV_1) ∨ ··· ∨ cnt_m(cntV_m)

Where again the cnt_i are the attributes of type “content” and the cntV_i their respective values. The translation of a rule representation with modality “obligation” then is:

obl(Cnt ∨ CntEx, Cnd) ∧ per(¬(Cnt ∨ CntEx), CndEx)

The translation of a rule with modality “prohibition” is:

for(Cntfor ∧ ¬CntEx, Cnd) ∧ per(Cntfor ∧ ¬CntEx, CndEx)

For the modality “permission” the translation is:

per(Cnt ∨ CntEx, Cnd) ∧ for(Cnt ∨ CntEx, CndEx)

Note that this translation commits to formalising, e.g., condition exceptions to obligations or prohibitions as additional permissions. Of course this is by no means the only possible translation: we could have chosen to embed the condition exception explicitly in the condition of the resulting formula. This choice is due to the fact that it facilitates the derivation of a general statement like “Flat roofs should be green roofs” from “Flat roofs should be green roofs unless they are glass roofs” in the particular logic used in the prover. In particular, when checking whether a flat roof in general should be a green roof we do not need to explicitly state that none of the condition exceptions are satisfied, in line with the standard approach in non-monotonic logic and default reasoning [30].
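The three translation schemes for the modalities “obligation”, “prohibition” and “permission” given above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not code from the released software: the helper names join_atoms and translate are hypothetical, formulae are rendered as plain strings, and the subformulae Cnd, CndEx and CntEx are assumed to have been built elsewhere and are passed in as opaque strings.

```python
from typing import List, Tuple

# (attribute name, attribute value), e.g. ("DachneigungMax", "5")
Attribute = Tuple[str, str]


def join_atoms(attributes: List[Attribute], connective: str) -> str:
    """Render attributes as atoms name(value) joined by the given connective."""
    atoms = [f"{name}({value})" for name, value in attributes]
    joined = f" {connective} ".join(atoms)
    # Parenthesise only when there is more than one atom.
    return f"({joined})" if len(atoms) > 1 else joined


def translate(modality: str, content: List[Attribute],
              cnd: str, cnd_ex: str, cnt_ex: str) -> str:
    """Map a rule representation to a dyadic deontic formula (as a string).

    cnd, cnd_ex and cnt_ex are assumed to be atomic or already
    parenthesised by the caller (hypothetical convention of this sketch).
    """
    cnt = join_atoms(content, "∧")      # Cnt: conjunction of content atoms
    cnt_for = join_atoms(content, "∨")  # Cntfor: disjunction of content atoms
    if modality == "obligation":
        return (f"obl({cnt} ∨ {cnt_ex}, {cnd}) ∧ "
                f"per(¬({cnt} ∨ {cnt_ex}), {cnd_ex})")
    if modality == "prohibition":
        return (f"for({cnt_for} ∧ ¬{cnt_ex}, {cnd}) ∧ "
                f"per({cnt_for} ∧ ¬{cnt_ex}, {cnd_ex})")
    if modality == "permission":
        return (f"per({cnt} ∨ {cnt_ex}, {cnd}) ∧ "
                f"for({cnt} ∨ {cnt_ex}, {cnd_ex})")
    raise ValueError(f"unknown modality: {modality}")


if __name__ == "__main__":
    print(translate("obligation", [("DachneigungMax", "5")],
                    "Cnd", "CndEx", "CntEx"))
```

Keeping the condition exception in a separate conjunct, as the formulae above do, means that the main obl, for or per conjunct never mentions CndEx, which is what allows the general statement to be derived without listing the exceptions.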
4. Architecture

We describe the system architecture of the full pipeline that takes raw text documents as input, builds semantic graphs using the system described in Section 2.1, extracts rules using our method presented in Section 3.2, maps them to deontic logic formulae as described in Section 3.3 and provides them as input to the prover (see Section 2.2). All components of our system are available as open-source software⁸ under an MIT license and the end-to-end pipeline is showcased in an online demo⁹ integrating all of them.

⁸ https://github.com/recski/brise-plandok
⁹ https://ir-group.ec.tuwien.ac.at/brise-extract
¹⁰ https://gitlab.freedesktop.org/poppler/poppler
¹¹ https://stanfordnlp.github.io/stanza

4.2. Semantic parsing

The next step in our pipeline is to construct semantic graphs from each sentence. The rule extraction algorithm described in Section 3 assumes that all relevant information present in the input text is available in the semantic graph that is the output of the generic semantic parsing pipeline described in Section 2.1. To ensure that this is the case some minor modifications of the semantic parsing algorithm were also necessary. First, we introduced a small set of rules in the grammar mapping Universal Dependency representations to semantic graphs for common words expressing negation and modality. The lemmas nicht and kein trigger the addition of the NEG element to the 4lang graph, dürfen ‘may’ and zulässig ‘permitted’ are mapped to PER, untersagen ‘prohibit’ and unzulässig ‘not permitted’ to FOR, and müssen ‘must’ to OBL. Additionally, the German construction consisting of the particle zu followed by the infinitive form of a verb must also trigger the OBL element, since it can express modality without any additional linguistic elements. This latter rule is implemented by two mechanisms, one that looks for the lemma zu with the universal part-of-speech tag (UPOS) PART, the other for the language-specific part-of-speech tag (XPOS) VVIZU marking verbs that contain the particle as an infix (e.g. herzustellen from herstellen ‘create, produce’). While even the most rudimentary treatment of the semantics of German modal expressions would go beyond such a simple categorization (and the scope of this work), in practice this small enhancement of the semantic representation of the input text was sufficient to allow for the detection of modality by the rule extraction mechanism. Finally we also added an ad-hoc rule for detecting exceptions: the presence of the words sofern or soweit, both roughly equivalent to the English conjunction ‘provided’ and introducing a clause that limits the applicability of a previous statement, triggers the addition of an element EXC to the semantic graph which is then also available for processing by the rule extraction mechanism.

4.3. Rule extraction

We now describe the implementation details of the two-step rule extraction method presented in Section 3.2. The output of the semantic parser, which serves as the input to rule extraction, is a single directed graph for each input sentence, generated by an Interpreted Regular Tree Grammar (IRTG) from Universal Dependency structures (see Section 2.1 for details). For recognizing subgraphs and mapping them to attributes we also use an IRTG over an algebra of s-graphs; this allows us to pipe the output of the semantic parser directly into our rule extraction grammar. For each 4lang graph we dynamically generate a unique grammar. The static set of rules encoding the correspondence between generic semantic structures and task-specific attributes is extended with empty terminal rules for each concept in the input graph; this ensures that the entire graph can be constructed by a sequence of operations that is derivable by the underlying RTG and thus the object can be parsed by the IRTG. The output interpretation of the IRTG is an algebra of trees, whose leaves are the individual strings that we use to construct rules in a subsequent step. The trees reflect the order in which these strings (names and values of attributes, modal elements, units of measurement) have been added to the output in parallel to the construction of the semantic graph; it is this additional information that allows us to implement the matching heuristics described in Section 3.2. For parsing of 4lang graphs and generation of attribute trees with these IRTGs we use the open-source alto¹² library, which also implements s-graph algebras and tree algebras. The alto system also supports probabilistic parsing with weighted grammars, and we rely on rule weights to ensure that rules which map subgraphs to attributes always take precedence over the ‘empty’ rules that are only added to the grammar to ensure that the full graph is derivable. In those few cases when more than one such ‘content’ rule matches the same subgraph, precedence is given to rules that cover larger substructures. The trees output by the IRTG parser serve as the input to the heuristic construction of rules described in the previous section. Finally, rules are converted from the generic (JSON) format to the language of dyadic deontic logic, as described in Section 3.3.

¹² https://github.com/coli-saar/alto

4.4. The prover

The final step in our pipeline consists of an exemplary reasoning mechanism to draw inferences from the extracted rules. This step is based on our adaptation¹³ of the generic theorem prover deonticProver2.0¹⁴, which implements backwards proof search in a sequent system for a dyadic deontic logic extended with rules for defeasibly reasoning from deontic assumptions [21].

¹³ https://github.com/blellmann/BRISEprover
¹⁴ http://subsell.logic.at/bprover/deonticProver/version2.0/

Apart from specifying the prover to the language obtained from the examples we needed to further modify it in two ways. First, in order to be able to handle attributes with numerical arguments, such as DachneigungMax for the maximal angle of the roof, or with strings as argument, such as Dachart for the roof type, we extended the prover and the underlying reasoning system to handle atomic propositions with arguments. In addition, we added ground sequents, i.e., structures which can be used as leaves in a derivation, corresponding to basic properties of measure-like attributes with natural numbers as values: where msr is a basic attribute for a measure such as Dachneigung, we considered a triple consisting of the attributes msrGenau(n), msrMin(n) and msrMax(n), expressing the facts that msr is exactly n, at least n, or at most n, respectively. The relations between these three attributes are given by:

• msrGenau(n) → msrMin(n) ∧ msrMax(n)
• msrMin(n) → msrMin(m), where m ≤ n
• msrMax(n) → msrMax(m), where n ≤ m
• msrMax(n) → ¬msrMin(m), where n < m

The ground sequents added to the prover then absorb basic reasoning on these axioms, so that, e.g., the formulae

¬(DachneigungGenau(n) ∧ DachneigungGenau(m))


are derivable for n ≠ m, stating that the exact angle of a roof does not have two different values.

Second, and more significantly, to be more in line with other approaches in the area of deontic reasoning such as [31] as well as for efficiency reasons we modified the way the prover handles specificity reasoning when reasoning from deontic assumptions. To illustrate, assume the deontic assumption

obl(DachneigungMax(5) ∧ BegruenungDach, Plangeb(7181))    (1)

stating that the maximal angle of the roof must be 5 degrees and the roof must be green under the condition that the building is in zone 7181. This would be partially overruled by the additional more specific assumption

obl(¬BegruenungDach, Plangeb(7181) ∧ Planzeichen(BB1))    (2)

stating that roofs in areas of zone 7181 marked with the label BB1 on the map must not be green roofs. The latter assumption is considered more specific than (1) because its condition Plangeb(7181) ∧ Planzeichen(BB1) strictly implies the condition Plangeb(7181) of assumption (1). In deonticProver2.0 the assumption (1) could still be used to infer obligations from the part of its content not in conflict with the content of the more specific assumption (2), such as

obl(¬DachneigungGenau(7), Plangeb(7181) ∧ Planzeichen(BB1))    (3)

stating that in areas of zone 7181 marked with the label BB1 the exact angle of the roof must not be 7 degrees. In our prover we changed this behaviour so that any assumption which is in conflict with a more specific applicable one cannot be used to derive any obligations. Thus (disregarding prohibition and permission operators for the sake of exposition) to check whether (3) is derivable we now check whether there is an assumption obl(A, B) such that

1. A → ¬DachneigungGenau(7) is derivable
2. Plangeb(7181) ∧ Planzeichen(BB1) → B is derivable
3. there is no applicable and more specific assumption conflicting with obl(A, B), i.e., there is no obl(C, D) such that
   a) Plangeb(7181) ∧ Planzeichen(BB1) → D and D → B are derivable
   b) C → ¬A is derivable

Crucially, the “no-conflict” check in item (3b) above only needs to be performed between two assumptions and not between the formula to be proved and an assumption. This means that instead of checking for conflicts many times redundantly in the search for a derivation (as is done in deonticProver2.0) it suffices to perform this check once in a preprocessing stage, store for every deontic assumption the list of conflicting ones, and only check that none of the assumptions in this list is applicable and more specific during the actual computation. In our experiments this increased efficiency was necessary for reasoning with a non-trivial number of deontic assumptions. To compare the two reasoning methods the user can switch between the original (“classic”) and modified (“modern”) versions on the web interface¹⁵ for the prover. For the sake of simplicity the web interface for the whole pipeline only uses the modified version.

¹⁵ http://subsell.logic.at/bprover/briseprover/

Originally, for derivable input deonticProver2.0 outputs a pdf file with a derivation in the calculus. However, since the derivations can become rather large (even breaking the maximal limit on object size in TeX) and the average user might not be acquainted with the specific formalism used in the prover, we further extended the output module with an option to print the derivation as an explanation in pseudo-natural language. Explanations can be unfolded step by step by clicking on a button labelled “Why?” after the “The statement ... is derivable.” output. In unfolding the explanation, propositional steps are skipped by default to reveal the crucial deontic statements and assumed facts used there. These intermediary steps can additionally be unfolded by clicking on a “Why does it follow from the above?” button. In the demo the user can select the output format.

We stress again that here our prover serves mainly as an example for a possible reasoning mechanism and that we do not claim that the underlying logic is necessarily the most appropriate. For this reason we also defer the theoretical details of the modifications of the underlying sequent system to a forthcoming companion paper.

5. Evaluation

The rule systems presented in Sec. 3.2 were developed based on a small annotated sample of sentences from the zoning plan of the City of Vienna. In order to establish a representative sample, we started by estimating the distribution of attributes in the entire corpus by manually labeling the sentences of 10 randomly selected documents with the attributes they mention (either as condition or content). This sample contains 344 mentions of attributes in 193 sentences (as well as 118 sentences without attribute mentions, mostly from the preambles). The number of unique attributes in the sample is 84, but 193 of the 344 instances (56%) come from the 16 most frequent attributes. We then chose 6 sentences from this sample that together contain mentions of 7 of these 16 attributes, including the 3 most frequent ones (GebaeudeHoeheMax, AbschlussDachMax, GebaeudeHoeheArt) that are alone responsible for 17%


of all attribute mentions in the larger sample. We annotated these 6 sentences with the full representation of all rules stated by them and developed our rule extraction system to achieve perfect performance on this toy corpus. Both this fully annotated set and the larger sample of 10 documents annotated for attribute mentions only are released along with the software¹⁶. While our method of selecting the sentences for the toy corpus ensures that the attribute extraction step of our method has high coverage (recall above 51% with a precision above 93% on the sample of 10 documents and 344 attribute instances), this cannot be considered a quantitative evaluation of the full rule extraction pipeline. The limited amount of annotated data also does not permit any conclusions about the effect of errors in syntactic parsing made by the stanza model, but our assumption that this should not become a bottleneck for such standard text is reinforced by the fact that we did not observe any such errors in our sample. A larger-scale annotation of attribute mentions is currently in progress.

¹⁶ https://github.com/recski/brise-plandok

6. Discussion

In this article we have presented a system for extracting formal rules from legal text using generic semantic parsing and domain-specific pattern-matching, and converting them to deontic logic for use in an automated reasoning system. All components of the pipeline, including those contributed in this paper, are made available as open-source software under the MIT license, for unrestricted use in future applications. Unlike machine learning based information extraction systems, our rule extraction model is fully explainable and serves as an example for a specific application of semantic parsing to domain-specific information extraction. While the semantic representation and parsing algorithms used in our pipeline are language-agnostic, they may require adaptation to new languages and domains. Furthermore, for domains and text genres that more closely resemble everyday language use, deep semantic analysis would require lexical inference, a notoriously difficult task in computational semantics [32, 33]. In our general rule representation we concentrated on deontic statements of a reasonably simple form. While this form seems to be well adapted to the regulations provided in the texts for the zoning maps of Vienna, there are some obvious limitations. First, since we always assume the presence of a deontic modality (obligation, prohibition or permission), at the moment we cannot treat constitutive norms [34] such as “The area marked on the map with the label BB1 is designated a residential area”. This issue could be addressed by adding an additional modality “constitutiveNorm” to the general representation together with an appropriate translation. Second, our propositional language is currently rather restricted, since we do not permit, e.g., disjunctions in the conditions or content of an obligation. Again, this could be addressed rather straightforwardly by extending the format of our representation, possibly along the lines of JsonLogic¹⁷. We also do not consider quantification or nested deontic operators. For the current application these features seemed not to be necessary. Most of these limitations are in line with other current approaches, e.g., [16, 28].

The proof-of-concept application presented in this paper can serve as a blueprint for semantics-based solutions to a wide range of information extraction tasks including variants of entity recognition and relation extraction. Such systems are generally more flexible, interpretable, and less prone to bias than the large neural network models used for similar tasks. However, to make such systems a viable alternative for everyday NLP applications, novel methods must be devised for the (semi-)automatic learning of task-specific rule systems like the one manually built for this project. Concerning the automated reasoning part, we plan to consider specifications to different frameworks in the future, including those of argumentation theory [27], I/O logic [28], and defeasible deontic logic [16], and integrate existing provers for these formalisms such as TOAST¹⁸, SPINdle¹⁹ or TurnipBox²⁰. Additionally, we plan to implement alternative translations from the generic representation to the language of dyadic deontic logic, corresponding to different interpretations of the logical structure of deontic statements. Along the lines of [35] this could be used to compare such different interpretations. Finally, we would like to investigate whether the part of our pipeline creating general rule representations could be used in combination with the NAI suite [17]. Our rule-based approach could be used as a first step to automatically suggest a formalisation of a given legal text, which then could be converted into the format used in the NAI suite and run through the quality assurance function provided there. The benefit would be that the legal experts do not need to actively formulate the formalisation of a legal text, but only to potentially adjust it based on the quality assurance checks.

¹⁷ https://jsonlogic.com
¹⁸ http://toast.arg-tech.org/
¹⁹ http://spindle.data61.csiro.au/spindle/
²⁰ https://turnipbox.netlify.app

Acknowledgments

We are grateful to the three anonymous reviewers for their suggestions and for additional references. Work supported by BRISE-Vienna (UIA04-081), a European Union Urban Innovative Actions project.


References

[1] L. Banarescu, C. Bonial, S. Cai, M. Georgescu, K. Griffitt, U. Hermjakob, K. Knight, P. Koehn, M. Palmer, N. Schneider, Abstract Meaning Representation for sembanking, in: Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Association for Computational Linguistics, Sofia, Bulgaria, 2013, pp. 178–186. URL: https://www.aclweb.org/anthology/W13-2322.
[2] C. Lyu, I. Titov, AMR parsing as graph prediction with latent alignment, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 397–407. URL: https://www.aclweb.org/anthology/P18-1037. doi:10.18653/v1/P18-1037.
[3] S. Zhang, X. Ma, K. Duh, B. Van Durme, AMR parsing as sequence-to-graph transduction, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, 2019, pp. 80–94. URL: https://www.aclweb.org/anthology/P19-1009. doi:10.18653/v1/P19-1009.
[4] O. Abend, A. Rappoport, Universal Conceptual Cognitive Annotation (UCCA), in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Sofia, Bulgaria, 2013, pp. 228–238. URL: https://www.aclweb.org/anthology/P13-1023.
[5] D. Hershcovich, O. Abend, A. Rappoport, A transition-based directed acyclic graph parser for UCCA, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 1127–1138. URL: https://www.aclweb.org/anthology/P17-1104. doi:10.18653/v1/P17-1104.
[6] D. Hershcovich, O. Abend, A. Rappoport, Multitask parsing across semantic representations, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 373–385. URL: https://www.aclweb.org/anthology/P18-1035. doi:10.18653/v1/P18-1035.
[7] H. Ozaki, G. Morio, Y. Koreeda, T. Morishita, T. Miyoshi, Hitachi at MRP 2020: Text-to-graph-notation transducer, in: Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing, Association for Computational Linguistics, Online, 2020, pp. 40–52. URL: https://www.aclweb.org/anthology/2020.conll-shared.4. doi:10.18653/v1/2020.conll-shared.4.
[8] D. Samuel, M. Straka, ÚFAL at MRP 2020: Permutation-invariant semantic parsing in PERIN, in: Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing, Association for Computational Linguistics, Online, 2020, pp. 53–64. URL: https://www.aclweb.org/anthology/2020.conll-shared.5. doi:10.18653/v1/2020.conll-shared.5.
[9] A. Kornai, The algebra of lexical semantics, in: C. Ebert, G. Jäger, J. Michaelis (Eds.), Proceedings of the 11th Mathematics of Language Workshop, LNAI 6149, Springer, 2010, pp. 174–199. doi:10.5555/1886644.1886658.
[10] G. Recski, Building concept definitions from explanatory dictionaries, International Journal of Lexicography 31 (2018) 274–311. doi:10.1093/ijl/ecx007.
[11] Á. Kovács, K. Gémes, A. Kornai, G. Recski, BMEAUT at SemEval-2020 task 2: Lexical entailment with semantic graphs, in: Proceedings of the Fourteenth Workshop on Semantic Evaluation, International Committee for Computational Linguistics, Barcelona (online), 2020, pp. 135–141. URL: https://www.aclweb.org/anthology/2020.semeval-1.15.
[12] J. Nivre, M. Abrams, Ž. Agić, L. Ahrenberg, L. Antonsen, K. Aplonova, M. J. Aranzabe, G. Arutie, M. Asahara, L. Ateyah, M. Attia, A. Atutxa, L. Augustinus, E. Badmaeva, M. Ballesteros, E. Banerjee, S. Bank, V. Barbu Mititelu, V. Basmov, J. Bauer, S. Bellato, K. Bengoetxea, Y. Berzak, I. A. Bhat, R. A. Bhat, E. Biagetti, E. Bick, R. Blokland, V. Bobicev, C. Börstell, C. Bosco, G. Bouma, S. Bowman, A. Boyd, A. Burchardt, M. Candito, B. Caron, G. Caron, G. Cebiroğlu Eryiğit, F. M. Cecchini, G. G. A. Celano, S. Čéplö, S. Cetin, F. Chalub, J. Choi, Y. Cho, J. Chun, S. Cinková, A. Collomb, Ç. Çöltekin, M. Connor, M. Courtin, E. Davidson, M.-C. de Marneffe, V. de Paiva, A. Diaz de Ilarraza, C. Dickerson, P. Dirix, K. Dobrovoljc, T. Dozat, K. Droganova, P. Dwivedi, M. Eli, A. Elkahky, B. Ephrem, T. Erjavec, A. Etienne, R. Farkas, H. Fernandez Alcalde, J. Foster, C. Freitas, K. Gajdošová, D. Galbraith, M. Garcia, M. Gärdenfors, S. Garza, K. Gerdes, F. Ginter, I. Goenaga, K. Gojenola, M. Gökırmak, Y. Goldberg, X. Gómez Guinovart, B. Gonzáles Saavedra, M. Grioni, N. Grūzītis, B. Guillaume, C. Guillot-Barbance, N. Habash, J. Hajič, J. Hajič jr., L. Hà Mỹ, N.-R. Han, K. Harris, D. Haug, B. Hladká, J. Hlaváčová, F. Hociung, P. Hohle, J. Hwang, R. Ion, E. Irimia, Ọ. Ishola,


T. Jelínek, A. Johannsen, F. Jørgensen, H. Kaşıkara, S. Kahane, H. Kanayama, J. Kanerva, B. Katz, T. Kayadelen, J. Kenney, V. Kettnerová, J. Kirchner, K. Kopacewicz, N. Kotsyba, S. Krek, S. Kwak, V. Laippala, L. Lambertino, L. Lam, T. Lando, S. D. Larasati, A. Lavrentiev, J. Lee, P. Lê Hồng, A. Lenci, S. Lertpradit, H. Leung, C. Y. Li, J. Li, K. Li, K. Lim, N. Ljubešić, O. Loginova, O. Lyashevskaya, T. Lynn, V. Macketanz, A. Makazhanov, M. Mandl, C. Manning, R. Manurung, C. Mărănduc, D. Mareček, K. Marheinecke, H. Martínez Alonso, A. Martins, J. Mašek, Y. Matsumoto, R. McDonald, G. Mendonça, N. Miekka, M. Misirpashayeva, A. Missilä, C. Mititelu, Y. Miyao, S. Montemagni, A. More, L. Moreno Romero, K. S. Mori, S. Mori, B. Mortensen, B. Moskalevskyi, K. Muischnek, Y. Murawaki, K. Müürisep, P. Nainwani, J. I. Navarro Horñiacek, A. Nedoluzhko, G. Nešpore-Bērzkalne, L. Nguyễn Thị, H. Nguyễn Thị Minh, V. Nikolaev, R. Nitisaroj, H. Nurmi, S. Ojala, A. Olúòkun, M. Omura, P. Osenova, R. Östling, L. Øvrelid, N. Partanen, E. Pascual, M. Passarotti, A. Patejuk, G. Paulino-Passos, S. Peng, C.-A. Perez, G. Perrier, S. Petrov, J. Piitulainen, E. Pitler, B. Plank, T. Poibeau, M. Popel, L. Pretkalniņa, S. Prévost, P. Prokopidis, A. Przepiórkowski, T. Puolakainen, S. Pyysalo, A. Rääbis, A. Rademaker, L. Ramasamy, T. Rama, C. Ramisch, V. Ravishankar, L. Real, S. Reddy, G. Rehm, M. Rießler, L. Rinaldi, L. Rituma, L. Rocha, M. Romanenko, R. Rosa, D. Rovati, V. Roșca, O. Rudina, J. Rueter, S. Sadde, B. Sagot, S. Saleh, T. Samardžić, S. Samson, M. Sanguinetti, B. Saulīte, Y. Sawanakunanon, N. Schneider, S. Schuster, D. Seddah, W. Seeker, M. Seraji, M. Shen, A. Shimada, M. Shohibussirri, D. Sichinava, N. Silveira, M. Simi, R. Simionescu, K. Simkó, M. Šimková, K. Simov, A. Smith, I. Soares-Bastos, C. Spadine, A. Stella, M. Straka, J. Strnadová, A. Suhr, U. Sulubacak, Z. Szántó, D. Taji, Y. Takahashi, T. Tanaka, I. Tellier, T. Trosterud, A. Trukhina, R. Tsarfaty, F. Tyers, S. Uematsu, Z. Urešová, L. Uria, H. Uszkoreit, S. Vajjala, D. van Niekerk, G. van Noord, V. Varga, E. Villemonte de la Clergerie, V. Vincze, L. Wallin, J. X. Wang, J. N. Washington, S. Williams, M. Wirén, T. Woldemariam, T.-s. Wong, C. Yan, M. M. Yavrumyan, Z. Yu, Z. Žabokrtský, A. Zeldes, D. Zeman, M. Zhang, H. Zhu, Universal dependencies 2.3, 2018. URL: http://hdl.handle.net/11234/1-2895, LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University.
[13] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C. D. Manning, Stanza: A python natural language processing toolkit for many human languages, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Online, 2020, pp. 101–108. URL: https://www.aclweb.org/anthology/2020.acl-demos.14. doi:10.18653/v1/2020.acl-demos.14.
[14] A. Koller, Semantic construction with graph grammars, in: Proceedings of the 11th International Conference on Computational Semantics, Association for Computational Linguistics, London, UK, 2015, pp. 228–238. URL: https://www.aclweb.org/anthology/W15-0127.
[15] M. Sergot, F. Sadri, R. Kowalski, F. Kriwaczek, P. Hammond, H. Cory, The British Nationality Act as a logic program, Communications of the ACM 29 (1986).
[16] G. Governatori, Practical normative reasoning with defeasible deontic logic, in: C. d’Amato, M. Theobald (Eds.), Reasoning Web 2018, volume 11078 of LNCS, Springer, 2018, pp. 1–25.
[17] T. Libal, A. Steen, NAI: Towards transparent and usable semi-automated legal analysis, in: IRIS 2020, Editions Weblaw, 2020, pp. 265–272.
[18] S. Batsakis, G. Baryannis, G. Governatori, I. Tachmazidis, G. Antoniou, Legal representation and reasoning in practice: A critical comparison, in: JURIX 2018, IOS Press, 2018, pp. 31–40. URL: https://doi.org/10.3233/978-1-61499-935-5-31.
[19] R. Calegari, G. Contissa, F. Lagioia, A. Omicini, G. Sartor, Defeasible systems in legal reasoning: A comparative assessment, in: JURIX 2019, IOS Press, 2019, pp. 169–174. doi:10.3233/FAIA190320.
[20] T. Libal, A meta-level annotation language for legal texts, in: M. Dastani, H. Dong, L. van der Torre (Eds.), CLAR 2020, volume 12061 of LNCS, Springer, 2020, pp. 131–150. doi:10.1007/978-3-030-44638-3_9.
[21] A. Ciabattoni, B. Lellmann, Sequent rules for reasoning and conflict resolution in conditional norms, in: F. Liu, A. Marra, P. Portner, F. V. D. Putte (Eds.), DEON 2020/2021, College Publications, 2021.
[22] D. Merigoux, N. Chataing, J. Protzenko, Catala: A programming language for the law, CoRR abs/2103.03198 (2021). URL: https://arxiv.org/abs/2103.03198. arXiv:2103.03198.
[23] N. O. Nawari, A generalized adaptive framework (GAF) for automating code compliance checking, Buildings 9 (2019). URL: https://www.mdpi.com/2075-5309/9/4/86. doi:10.3390/buildings9040086.
[24] A. Sleimi, N. Sannier, M. Sabetzadeh, L. C. Briand, M. Ceci, J. Dann, An automated framework for the extraction of semantic legal metadata from legal texts, Empirical Software Engineering 26 (2021) 43. doi:10.1007/s10664-020-09933-5.


[25] J. Morris, Blawx: Rules as code demonstration, MIT Computational Law Report (2020). URL: https://law.mit.edu/pub/blawxrulesascodedemonstration.
[26] J. Zhang, N. M. El-Gohary, Automated information transformation for automated regulatory compliance checking in construction, Journal of Computing in Civil Engineering 29 (2015) B4015001. doi:10.1061/(ASCE)CP.1943-5487.0000427.
[27] S. Modgil, H. Prakken, The ASPIC+ framework for structured argumentation: A tutorial, Argument and Computation 5 (2014) 31–62. URL: http://dx.doi.org/10.1080/19462166.2013.869766.
[28] X. Parent, L. van der Torre, Input/output logic, in: D. Gabbay, J. Horty, X. Parent, R. van der Meyden, L. van der Torre (Eds.), Handbook of Deontic Logic and Normative Systems, College Publications, 2013, pp. 495–544.
[29] A. Koller, M. Kuhlmann, A generalized view on parsing and translation, in: Proceedings of the 12th International Conference on Parsing Technologies, Association for Computational Linguistics, Dublin, Ireland, 2011, pp. 2–13. URL: https://www.aclweb.org/anthology/W11-2902.
[30] R. Reiter, A logic for default reasoning, Artificial Intelligence (1980).
[31] J. F. Horty, Reasons as Defaults, Oxford University Press, 2012.
[32] A. Talman, S. Chatzikyriakidis, Testing the generalization power of neural network models across NLI benchmarks, in: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics, Florence, Italy, 2019, pp. 85–94. URL: https://www.aclweb.org/anthology/W19-4810. doi:10.18653/v1/W19-4810.
[33] M. Schmitt, H. Schütze, Language models for lexical inference in context, in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics, Online, 2021, pp. 1267–1280. URL: https://www.aclweb.org/anthology/2021.eacl-main.108.
[34] G. Boella, L. W. N. van der Torre, Regulative and constitutive norms in normative multiagent systems, in: D. Dubois, C. A. Welty, M. Williams (Eds.), KR2004, AAAI Press, 2004, pp. 255–266. URL: http://www.aaai.org/Library/KR/2004/kr04-028.php.
[35] B. Lellmann, F. Gulisano, A. Ciabattoni, Mīmāṃsā deontic reasoning using specificity: A proof theoretic approach, Artificial Intelligence and Law (published online 2020). doi:10.1007/s10506-020-09278-w.