=Paper=
{{Paper
|id=Vol-2888/paper3
|storemode=property
|title=Explainable Rule Extraction via Semantic Graphs
|pdfUrl=https://ceur-ws.org/Vol-2888/paper3.pdf
|volume=Vol-2888
|authors=Gabor Recski,Björn Lellmann,Adam Kovacs,Allan Hanbury
|dblpUrl=https://dblp.org/rec/conf/icail/RecskiLKH21
}}
==Explainable Rule Extraction via Semantic Graphs==
Gabor Recski (1), Björn Lellmann (2,3), Adam Kovacs (1,4) and Allan Hanbury (1)

(1) TU Wien, Vienna, Austria; (2) SBA Research, Vienna, Austria; (3) Federal Ministry for Digital and Economic Affairs, Vienna, Austria (since May 3, 2021); (4) Budapest University of Technology and Economics, Budapest, Hungary

Contact: gabor.recski@tuwien.ac.at (G. Recski); lellmann@logic.at (B. Lellmann); adam.kovacs@tuwien.ac.at (A. Kovacs); allan.hanbury@tuwien.ac.at (A. Hanbury)

ORCID: 0000-0001-5551-3100 (G. Recski); 0000-0002-5335-1838 (B. Lellmann); 0000-0001-6132-7144 (A. Kovacs); 0000-0002-7149-5843 (A. Hanbury)

Proceedings of the Fifth Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL 2021), June 25, 2021, São Paulo, Brazil. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, http://ceur-ws.org), ISSN 1613-0073.

'''Abstract.''' We present an end-to-end system for extracting deontic logic formulae from legal text using a generic semantic parsing module and task-specific graph grammars, and for performing automated reasoning on the extracted formulae. The pipeline enables automated compliance checking and is applied to text documents of the zoning map of the city of Vienna. All components are released as open-source software; the full pipeline is showcased in an online demo.

'''Keywords:''' semantic parsing, information extraction, automated reasoning, deontic logic

===1. Introduction===

We present an end-to-end system for extracting deontic logic formulae from legal text using a generic semantic parsing module and task-specific graph grammars, and for performing automated reasoning on the extracted formulae. An overview of the pipeline is shown in Fig. 1. Plain text regulations are processed by a pipeline of domain-agnostic language processing tools, including a system for building syntax-independent concept graphs that represent the meaning of each sentence. These graphs serve as the input for a task-specific rule extraction module that maps them to deontic logic formulae, which in turn are used in an automated reasoning system. The proposed pipeline is applied to text documents of the zoning map of the city of Vienna (see https://www.wien.gv.at/flaechenwidmung/public/ for the map and https://www.data.gv.at/katalog/dataset/flachenwidmungs-und-bebauungsplan-plandokumente-wien for how to obtain the text documents, in German), an exciting corpus of legal regulations whose highly structured nature renders it very well suited for formal approaches. In the absence of a large-scale annotated corpus we evaluate our approach on a toy dataset of manually analyzed sentences that were selected to cover the most frequent attributes in the full dataset. Possible applications include automated compliance checking and question answering. The semantic parser and rule extractor components are both entirely rule-based, making our system an example of true explainable AI (XAI). Unlike in deep learning-based information extraction systems, extracted rules can be directly traced back to text patterns, making it straightforward to provide natural language explanations for decisions made based on them. This explainable nature also enables human-in-the-loop operation and provides safeguards against biased decision-making. Our main contributions are:

* Specification of a formal representation of deontic statements, including those of the construction regulation domain, for automated rule extraction and reasoning
* A preprocessed, structured corpus of sentences extracted from the zoning plan of the City of Vienna, a small subset of which is annotated manually with formal rule representations
* A grammar-based system for explainable rule extraction from semantic graphs, evaluated on the annotated corpus
* The adaptation and extension of a general theorem prover to the reasoning domain, including natural language output
* System architecture and working prototype for an end-to-end system for rule extraction and automated reasoning from raw text

The paper is structured as follows. In Sec. 2 we review recent work on semantic parsing, automatic rule extraction, and automated reasoning for legal-tech applications, and present the dependencies of our pipeline: a task-independent semantic parser and a deontic logic prover. Sec. 3 presents the rule extraction method, and Sec. 4 describes the architecture of the full pipeline. Sec. 5 provides a preliminary evaluation of our rule extraction method, and Sec. 6 discusses next steps and draws some conclusions. All components of our system are available as open-source software (https://github.com/recski/brise-plandok). The example application is showcased in an online demo (https://ir-group.ec.tuwien.ac.at/brise-extract).

Figure 1: Overview of our pipeline (PDF documents → Text segmentation → Sentences → Semantic parsing → Concept graphs → Rule extraction → Deontic rules → Reasoning). The solid rectangle indicates the newly contributed component and format, dashed rectangles mark existing tools that we modify or extend.

===2. Related work===

The related work for the two main components of the system is presented: (i) semantic parsing for extracting logic expressions directly from legal text, and (ii) deontic logic for performing automated reasoning on the extracted formulae.

====2.1. Semantic parsing====

Semantic parsing is the task of automatically mapping natural language text to a formal representation of its meaning. Most contemporary architectures for solving information extraction tasks do not perform semantic parsing, instead relying on models which directly encode the correspondence between natural language text and some set of task-specific structures such as labels, sequences, attribute-value structures, etc. These models are primarily built using machine learning methods, whose performance is dependent on the quality and quantity of available training data and whose decisions are difficult to interpret and prone to bias. Rule-based models, on the other hand, require considerable expert effort to build and maintain and can still be difficult to adapt to changes in the task definition. The architecture we propose tackles the information extraction task in two steps: the mapping of natural language text to task- and domain-independent meaning representations (semantic parsing) followed by a task-specific information extraction step that operates on these representations. In this section we give a brief overview of common approaches to semantic parsing and of the representation framework used by our pipeline.

Unlike other forms of automated linguistic annotation such as part-of-speech tagging or syntactic parsing, semantic parsing is not in any way standardized in the natural language processing (NLP) community. Even those few frameworks that have recently attracted growing interest in the task are rarely used as intermediate representations in NLP pipelines. Abstract Meaning Representations (AMRs) [1] represent sentence meaning as directed graphs of words, but do not provide a model of word meaning and are highly English-specific, not intended as a language-independent framework of meaning representation. The task of parsing raw text to AMR graphs has recently attracted growing interest and is usually performed using deep neural networks [2, 3] trained on annotated corpora, also called sembanks. Universal Conceptual Cognitive Annotation (UCCA) [4] takes a language-agnostic approach to meaning representation, modeling sentence meaning with directed acyclic graphs (DAGs) representing scenes evoked by predicates. Top UCCA parsers also rely on manually annotated corpora and neural networks [5, 6, 7, 8]. Since these frameworks do not provide a generic parsing algorithm, building representations for a new language and/or new domain would require the manual compilation of large annotated datasets that could be used to train end-to-end machine learning models. For the pipeline presented in this paper we choose the language-independent 4lang framework [9], for which a robust parsing method [10] with an open-source implementation (https://github.com/adaamko/wikt2def) [11] is also available.

The 4lang framework represents the meaning of both words and larger units like phrases and sentences as directed graphs of concepts. The representation is syntax-independent: concepts do not have types such as part-of-speech, nor are they even typed as predicate or argument. A key feature of 4lang graphs that enables uniform treatment of syntactically different constructions is the 0-relation (<math>\xrightarrow{0}</math>), a single representation for a range of closely related semantic relationships such as the ISA relationship (e.g. roof <math>\xrightarrow{0}</math> covering), attribution (e.g. sidewalk <math>\xrightarrow{0}</math> paved), and predication (e.g. platform <math>\xrightarrow{0}</math> extend). 4lang graphs can be built from raw text automatically using a rule-based system that uses Universal Dependencies (UD) [12] as an intermediate step. UD trees encode grammatical relations (dependencies) between pairs of words in a sentence; an example of such an analysis is shown in Fig. 2, described later. UD parsers are available for dozens of languages; in this pipeline we use the stanza package (https://stanfordnlp.github.io/stanza) [13]. The transformation of UD trees into 4lang graphs is based on a small set of rules described in [10] and implemented by [11] as parsing and decoding of an Interpreted Regular Tree Grammar (IRTG) [14], a formalism that we also use in this work for implementing the rule extraction mechanism that maps 4lang semantic graphs to trees of attributes (see Sec. 3.2). Most rules map a single UD edge between two content words to 4lang edges connecting the corresponding concepts (e.g., in the example in Fig. 2, the relations amod and nsubj:pass are mapped to a 0-edge and a 2-edge, respectively).

====2.2. Automated reasoning and deontic logic====

The investigation of automated reasoning methods in the legal domain has a long history, see, e.g., the seminal [15]. More recently, automated reasoning methods have been considered for legal texts or company regulations [16, 17], and a large number of reasoning frameworks and tools are available, see, e.g., [18, 19] for an overview. Following the approach in [17], here we consider a general and formalism-independent representation of the regulations, which can be translated into different frameworks. Fortunately, the structure of the intended application, the regulations of the zoning map of Vienna, is relatively clear, and mostly does not require advanced features of the formal language such as nested deontic operators in the assumptions [16] or macros [20].

The specific reasoning engine used in this paper, the theorem prover BRISEprover (see http://subsell.logic.at/bprover/briseprover/), is an extension and modification of the theorem prover deonticProver2.0 (see http://subsell.logic.at/bprover/deonticProver/version2.0/) developed in [21] for reasoning with assumptions in dyadic deontic logic. In this logical framework, propositional logic is extended with dyadic deontic operators obl, for and per. Formulae obl(A, B), for(A, B) and per(A, B) are read as “it is obligatory that A given B”, “it is forbidden that A given B” and “it is permitted that A given B”, respectively. As the first extension considered here we extend the language with predicate symbols to capture properties, e.g., “building height at most 9 metres”, in atomic formulae, e.g., buildingHeightMax(9). Note that it would be straightforward to include an additional argument representing the subject, i.e., which building has a height of at most 9 metres. Since in our case this is clear from the context we simplify the representation by assuming the subject is always the same. The prover decides derivability from a set of factual and deontic assumptions, i.e., non-deontic and non-nested deontic formulae respectively.

The reasoning engine supports the specificity principle in the form that more specific deontic assumptions override less specific conflicting ones. E.g., the assumption obl(buildingHeightMax(9), facingStreet), stating that buildings facing the street must have a maximal height of 9 metres, overrides the less specific assumption obl(buildingHeightMin(10), ⊤), stating that under the always true condition ⊤ buildings must have a minimal height of 10 metres. Due to this property the reasoning process is non-monotone: while from the single assumption obl(buildingHeightMin(10), ⊤) we can derive the formula obl(¬buildingHeightExactly(8), facingStreet), stating that the height of buildings facing the street must not be exactly 8 metres, we can no longer derive the same formula once the assumption obl(buildingHeightMax(9), facingStreet) is added. As a second modification from the original deonticProver2.0, here we consider a different conflict resolution mechanism, resulting in particular in higher efficiency of the implementation. Reasoning in this logic is implemented via backwards proof search in a sequent system with underivability statements, both in the system from [21] and in a slightly modified reasoning engine. See op. cit. for the details of the original system and Sec. 4.4 for the modifications.

We are thankful to one of the reviewers for bringing additional relevant literature to our attention [22, 23, 24, 25, 26]. Unfortunately, space and time constraints prevented a detailed comparison for the final version.

===3. Rule extraction===

The pipeline presented here takes as its input raw text documents containing regulations of the zoning map of the City of Vienna, builds representations of their meaning using the 4lang system (see Sec. 2.1), uses the resulting semantic graphs to extract the legal content of regulations and makes them available to the prover (see Sec. 2.2), which verifies whether some statement is derivable given a set of assumptions. The full architecture is described in Sec. 4; we now present the novel rule extraction component and its interfaces to semantic parsing and automated reasoning.

====3.1. Representation====

Since we do not want to commit to modeling regulations in a particular formalism, and to facilitate the integration of different reasoning engines, we first convert the legal content of the regulations into a generic representation. For this we assume that a deontic regulation, i.e., a regulation stating an obligation, prohibition or permission, is comprised of the following parts:

* Modality: This states whether the regulation is an obligation, a prohibition or a permission;
* Content: The content of the regulation, i.e., what is obligatory / prohibited / permitted;
* Conditions: The conditions of the regulation stating when the regulation applies;
* ConditionExceptions: Possible exceptions to the conditions, stating when the regulation does not apply. E.g., in the regulation “Flat roofs should be green roofs, unless they are glass roofs”, the “glass roofs” is an exception to the condition.
* ContentExceptions: Possible exceptions to the content. E.g., in the regulation “Windows are prohibited except for portholes” the “portholes” are an exception to the content.

Figure 2: Universal Dependency analysis, 4lang semantic graph, and formal rule representation for the sentence Flachdächer bis zu einer Dachneigung von fünf Grad sind entsprechend dem Stand der technischen Wissenschaften zu begrünen. ‘Flat roofs with a pitch not exceeding 5 degrees must be greened using state of the art technologies.’ The part-of-speech tags shown in the figure are: Flachdächer/NOUN bis/ADP zu/ADP einer/DET Dachneigung/NOUN von/ADP fünf/NUM Grad/NOUN sind/AUX entsprechend/ADP dem/DET Stand/NOUN der/DET technischen/ADJ Wissenschaften/NOUN zu/PART begrünen/VERB. The formal rule representation shown in the figure is:

 {"modality": "obligation",
  "attributes": [
    {"type": "content", "name": "BegruenungDach", "value": null},
    {"type": "condition", "name": "Dachart", "value": "Flachdach"},
    {"type": "condition", "name": "DachneigungMax", "value": "5Grad"}]}

This level of granularity seems to capture the necessary details of the sentences found in the documents from the city of Vienna zoning map while being flexible enough to permit translation into different frameworks like dyadic deontic logic, defeasible deontic logic [16], argumentation based approaches [27] or input output logic [28]. Concretely, we represent this structure as a JSON object

 {"modality": Modality, "attributes": List}

where the key “modality” takes one of the values “obligation”, “prohibition”, “permission”, and where List is an array containing attributes of the following form:

 {"name": Name, "value": Value, "type": Type}

Here Name is the name of the attribute, Value is its value, and Type is one of “content”, “condition”, “conditionException”, “contentException”. We obtained the attribute names via collaboration with domain experts from the City of Vienna Baupolizei and verified that the structure is appropriate by manually annotating several hundred sentences from the documents of the zoning map.

As an example, the generic representation of the deontic regulation “Flat roofs should be green roofs unless they are glass roofs” is given by:

 {"modality": "obligation",
  "attributes": [
    {"name": "roofType", "value": "flatRoof", "type": "condition"},
    {"name": "greenRoof", "value": null, "type": "content"},
    {"name": "roofType", "value": "glassRoof", "type": "conditionException"}]}

Here we modelled both flat roofs and glass roofs as roof types, while modelling the property of being a green roof as an atomic propositional statement because the latter in the documents corresponds to a more complex proposition.

Note that in contrast to, e.g., the approach in [17], at this stage we do not commit to a particular modelling of exceptions by negation-as-failure or negated conditions. This retains the flexibility of the general format necessary for subsequent specification into a large number of different formalisms. There are of course some limitations inherent in this representation, in particular we do not model negation. See Sec. 6 for a more detailed discussion.

====3.2. Extraction====

The mapping from semantic graphs to the rules described above is implemented in two steps. An IRTG grammar similar to that of the semantic parsing system (see Sec. 2.1) is used to extract all attributes, values, and expressions of modality that occur within a single sentence. A simple heuristic then matches these elements with each other in order to create the generic rule representations described in the previous section. The mapping we establish is between patterns in generic semantic graphs and formal rules. This two-step approach is an arbitrary simplification that makes implementation simpler and more flexible. We shall now describe the approach using some examples and point out some of its current limitations.

The first component of our rule extraction system is an IRTG grammar mapping 4lang graphs to lists of strings representing attribute names (e.g. DachneigungMax ‘maximal roof pitch’), modalities (e.g. OBL ‘obligation’), as well as numbers and units of measurement that may be interpreted as attribute values (e.g. 5 and m). We shall refer to this grammar as fl_to_attr. The heuristics for matching values and modalities to attribute names, described later in this section, can only disambiguate between multiple solutions if they are informed about the positions of patterns relative to each other. Hence we use a Tree Grammar as the output interpretation, which allows us to represent these strings as leaves of a tree that resembles the order in which they were recognized, corresponding to steps of composing the 4lang graph from subgraphs. An example of this mapping is presented in Fig. 3.

Figure 3: Example mapping implemented by the fl_to_attr grammar for the sentence in Figure 2. The 4lang graph is mapped (⇕) to an attribute tree with the leaves OBL, BegruenungDach, Dachart, v_Flachdach, DachneigungMax, q_Grad and v_5.

Each IRTG rule encoding the correspondence between a 4lang subgraph and an attribute or tree of attributes is a mapping between rule applications in an s-graph algebra [29] and a tree algebra. S-graphs are graphs whose nodes may be marked by special labels called sources, and the s-graph algebra’s core operation merge creates new s-graphs by taking the union of its arguments but merging nodes that have the same source. We now illustrate this mechanism with a simple example; for a more detailed introduction to s-graph algebras and their application to semantic parsing the reader is referred to [29]. The IRTG rule presented in Fig. 4 defines two binary operations, to be performed in parallel on corresponding pairs of s-graphs and trees. The operation of the 4lang interpretation merges two graphs along their root nodes with the two nodes of the edge Gebäudehöhe <math>\xrightarrow{0}</math> maximal to create a single graph, while the operation of the attr interpretation merges the two attribute trees with each other and subsequently with a tree of two nodes (OBL, GebaeudeHoeheMax). S-graph algebras use three types of operations: the merge operation merges two graphs on nodes with matching sources while the rename and forget operations can be used to change or delete sources. In the example rule in Fig. 4 the two argument graphs are renamed so that their root sources become src and tgt, and these sources need to be deleted after the merge operation.

 E -> a_gebaeudehoehe_maximal(E, E) [100]
 [fl] f_src(f_tgt(merge(r_src(?1), merge(r_tgt(?2),
      "(u/ Gebaeudehoehe :0 (v / maximal))"))))
 [attr] *(*(?1, ?2), *("OBL", "GebaeudeHoeheMax"))

Figure 4: Example rule of the fl_to_attr IRTG grammar. The first interpretation line is wrapped for readability, the operations are explained in the text.

Once all relevant strings have been extracted from the semantic graph, the next step is to build the rule representation by heuristically matching modalities as well as potential attribute values to attribute names. We illustrate this process using an example with multiple attributes; consider the following sentence fragment, a subordinate clause of a longer regulation: bei einer Straßenbreite ab 10 m entlang der Fluchtlinien Gehsteige mit einer Breite von mindestens 2,0 m herzustellen sind. ‘in case of a street width of 10 m or more, sidewalks with a width of at least 2.0 m are to be constructed along the alignment lines.’ The pipeline described so far will extract from this sentence three attribute names (StrassenbreiteMin ‘minimum street width’, GehsteigbreiteMin ‘minimum sidewalk width’, AnFluchtLinie ‘along the alignment line’), two numbers (10, 2.0), two occurrences of the unit of measurement m, and the modality OBL ‘obligation’ (the latter based on the word form herzustellen composed of the verb herstellen ‘construct, produce’ and the infinitive marker zu). These eight elements are organized in a tree structure according to the order in which they appeared in the IRTG derivation of the corresponding semantic graph, shown in Fig. 5.

Figure 5: Example of attribute matching for the sentence ... bei einer Straßenbreite ab 10 m entlang der Fluchtlinien Gehsteige mit einer Breite von mindestens 2,0 m herzustellen sind. ‘in case of a street width of 10 m or more, sidewalks with a width of at least 2.0 m are to be constructed along the alignment lines.’ The leaves of the tree are AnFluchtLinie, OBL, GehsteigbreiteMin, StrassenbreiteMin, m, 2.0, m and 10.

The tree of attributes is stored in a custom data structure that in every node stores the length of the shortest path between any pair of attributes below that node. This means that by querying the root of the tree we can retrieve for any attribute a list of all other attributes ranked by their relative distance in the tree. We first match all units of measurement to the nearest value in the tree, allowing each value to be associated with at most one unit of measurement. Next, all non-boolean attributes are matched to the nearest value, using a greedy algorithm: all possible attribute-value pairs are sorted by their relative distance in the tree, the pair with the shortest path is stored as a match, its members are removed from the lists of attributes and values that are still to be matched, and this step is repeated until at least one of the lists becomes empty. For example, in Fig. 5 the attribute StrassenbreiteMin ‘minimum road width’ is paired with the value 10, since they are the closest of any pairs of attribute and value in the tree (the attribute AnFluchtLinie is excluded from this process because it is listed as a boolean attribute). In the second step the attribute GehsteigbreiteMin ‘minimum sidewalk width’ is matched with the only remaining value, 2.0.

Finally, the type of each attribute must be detected, i.e. it must be determined whether an attribute is a condition of the rule, part of the content, or an exception to either one of these (contentException, conditionException, see Sec. 3.1). Some attributes are explicitly listed as always being of type condition, e.g., Planzeichen and Widmung, which refer to the ID and designation of an area. Next, the extracted modality elements OBL, FOR, EXC are matched to the nearest of the remaining attributes, which are in turn determined to be part of the content (in case of FOR and OBL) or a conditionException (in case of EXC). Finally, all remaining attributes are given the type condition.

These simple heuristics, which are already capable of correctly matching attributes to their values and of distinguishing between the roles each attribute plays in a rule, allow us to keep the grammar simpler than it would be if it were to directly generate structures like our generic rule representations. The IRTG rules currently used (and exemplified in Fig. 4) simply represent one-to-one correspondences between 4lang subgraphs and strings, but the underlying Regular Tree Grammar only uses a single nonterminal symbol, i.e. rules are not sensitive to which other rules were used to construct their arguments. It would be quite straightforward to introduce non-terminal symbols representing attribute names, numbers, measurement units, and modalities, so that the IRTG itself would enforce the structure that is currently built in a postprocessing step. The tree in Figure 5 resembles the order in which patterns corresponding to each element were found in the semantic graph, which in turn correspond to disjoint subgraphs of the semantic representation that are each connected to the concept herstellen and roughly correspond to fragments of the original sentence such as Gehsteig mit einer Breite von mindestens 2,0 m herstellen ‘construct sidewalk with a width of at least 2.0 m’, bei einer Straßenbreite ab 10 m ‘with a road width of at least 10 m’, entlang der Fluchtlinien ‘along the alignment lines’, etc. Here we limit our grammar to the task of understanding each of these patterns independently because this proves sufficient for our purposes of constructing formal rules from sentences of the zoning map of the City of Vienna, which tend to be in a one-to-one correspondence with rules of the general structure described in Sec. 3.1. I.e., we greatly simplify our task by exploiting the fact that authors of this piece of legislation rarely express a rule in multiple sentences or incorporate several rules in a single sentence. A notable exception is when some conditions such as the ID or designation of the areas that a rule refers to are not repeated in every sentence within the same section. These conditions are propagated by a simple inheritance mechanism that assumes the values of such attributes to hold within a single section (see Sec. 4 for how section boundaries are detected).

====3.3. Specification to dyadic deontic logic====

The extracted rules in the generic format are then translated into the language of dyadic deontic logic. We chose this particular framework because the sentences of the zoning map of Vienna exhibit a very clear deontic structure, in contrast, e.g., to the largely definitional character of the British Nationality Act investigated in [15]. The translation is dependent on the modality in the following way. Given a rule representation

 {"modality": Modality, "attributes": List}

let

<math>\mathit{Cnd} := \mathit{cnd}_1(\mathit{cndV}_1) \wedge \dots \wedge \mathit{cnd}_n(\mathit{cndV}_n)</math>

<math>\mathit{CntEx} := \mathit{ctEx}_1(\mathit{ctExV}_1) \vee \dots \vee \mathit{ctEx}_m(\mathit{ctExV}_m)</math>

<math>\mathit{CndEx} := \mathit{cdEx}_1(\mathit{cdExV}_1) \vee \dots \vee \mathit{cdEx}_\ell(\mathit{cdExV}_\ell)</math>

where the cnd_i are all the attributes occurring with type “condition” in List, and the cndV_i are their respective values, and similarly for ctEx for type “contentException” and cdEx for type “conditionException”. For the content, let further

<math>\mathit{Cnt} := \mathit{cnt}_1(\mathit{cntV}_1) \wedge \dots \wedge \mathit{cnt}_m(\mathit{cntV}_m)</math>

<math>\mathit{Cnt}_{\mathrm{for}} := \mathit{cnt}_1(\mathit{cntV}_1) \vee \dots \vee \mathit{cnt}_m(\mathit{cntV}_m)</math>

where again the cnt_i are the attributes of type “content” and the cntV_i their respective values. The translation of a rule representation with modality “obligation” then is:

<math>\mathrm{obl}(\mathit{Cnt} \vee \mathit{CntEx}, \mathit{Cnd}) \wedge \mathrm{per}(\neg(\mathit{Cnt} \vee \mathit{CntEx}), \mathit{CndEx})</math>

The translation of a rule with modality “prohibition” is:

<math>\mathrm{for}(\mathit{Cnt}_{\mathrm{for}} \wedge \neg\mathit{CntEx}, \mathit{Cnd}) \wedge \mathrm{per}(\mathit{Cnt}_{\mathrm{for}} \wedge \neg\mathit{CntEx}, \mathit{CndEx})</math>

For the modality “permission” the translation is:

<math>\mathrm{per}(\mathit{Cnt} \vee \mathit{CntEx}, \mathit{Cnd}) \wedge \mathrm{for}(\mathit{Cnt} \vee \mathit{CntEx}, \mathit{CndEx})</math>

Note that this translation commits to formalising, e.g., condition exceptions to obligations or prohibitions as additional permissions. Of course this is by no means the only possible translation: we could have chosen to embed the condition exception explicitly in the condition of the resulting formula. We made this choice because it facilitates the derivation of a general statement like “Flat roofs should be green roofs” from “Flat roofs should be green roofs unless they are glass roofs” in the particular logic used in the prover. In particular, when checking whether a flat roof in general should be a green roof we do not need to explicitly state that none of the condition exceptions are satisfied, in line with the standard approach in non-monotonic logic and default reasoning [30].

===4. Architecture===

We describe the system architecture of the full pipeline that takes raw text documents as input, builds semantic graphs using the system described in Section 2.1, extracts rules using our method presented in Section 3.2, maps them to deontic logic formulae as described in Section 3.3 and provides them as input to the prover (see Section 2.2). All components of our system are available as open-source software (https://github.com/recski/brise-plandok) under an MIT license and the end-to-end pipeline is showcased in an online demo (https://ir-group.ec.tuwien.ac.at/brise-extract) integrating all of them.

====4.1. Preprocessing and segmentation====

The input to our pipeline consists of PDF documents downloaded from the public website of the City of Vienna. Each PDF document contains regulations pertaining to one zoning area (Plangebiet), indicated by a four-digit ID. We discard the fraction of documents that are scanned images of printed documents and do not contain machine-readable text data (253/1431 = 17.7%); we could include these in our experiments by running optical character recognition (OCR). PDF documents are then converted to plain text using the pdftotext utility, part of the open-source Poppler library (https://gitlab.freedesktop.org/poppler/poppler). We use the -layout option of the tool to maintain page layout in the output text file, which greatly simplifies the subsequent extraction of document structure. Next we use a small set of regular expressions to establish section boundaries and extract section numbers from the text. Section numbering often makes use of several levels (e.g. 1, 1.1, 1.1.1, etc.), but this is not consistent across documents, therefore we only consider top-level sections in subsequent steps that are sensitive to section boundaries. Besides the inheritance mechanism described in Section 3.2, this decision is crucial for sentence segmentation, the next step in our pipeline, for which we use a customized version of the German sentence splitting model of the stanza library (https://stanfordnlp.github.io/stanza) [13]. The output of the standard model is postprocessed to undo sentence splits that have been made in error (e.g. those after periods following abbreviations characteristic of legal text) and also those made after colons (:) that separate a predicate from its object(s), such as in the text Für die mit BB4 bezeichneten Grundflächen wird bestimmt: Die Errichtung von Gebäuden mit einer maximalen Gebäudehöhe von 8 m ist zulässig. ‘For areas marked BB4 it is determined: construction of buildings with a maximum building height of 8 m is allowed.’ The custom sentence segmentation step is followed by stanza’s default German pipeline (de-gsd, stanza model version 1.1.0) for tokenization, part-of-speech (POS) tagging and universal dependency parsing.

====4.2. Semantic parsing====

The next step in our pipeline is to construct semantic graphs from each sentence. The rule extraction algorithm described in Section 3 assumes that all relevant information present in the input text is available in the semantic graph that is the output of the generic semantic parsing pipeline described in Section 2.1. To ensure that this is the case some minor modifications of the semantic parsing algorithm were also necessary. First, we introduced a small set of rules in the grammar mapping Universal Dependency representations to semantic graphs for common words expressing negation and modality. The lemmas nicht and kein trigger the addition of the NEG element to the 4lang graph, dürfen ‘may’ and zulässig ‘permitted’ are mapped to PER, untersagen ‘prohibit’ and unzulässig ‘not permitted’ to FOR, and müssen to OBL. Additionally, the German construction consisting of the particle zu followed by the infinitive form of a verb must also trigger the OBL element, since it can express modality without any additional linguistic elements. This latter rule is implemented by two mechanisms, one that looks for the lemma zu with the universal part-of-speech tag (UPOS) PART, the other for the language-specific part-of-speech tag (XPOS) VVIZU marking verbs that contain the particle as an infix (e.g. herzustellen from herstellen ‘create, produce’). While even the most rudimentary treatment of the semantics of German modal expressions would

… to the output in parallel to the construction of the semantic graph; it is this additional information that allows us to implement the matching heuristics described in Section 3.2. For parsing of 4lang graphs and generation of attribute trees with these IRTGs we use the open-source alto library (footnote 12), which also implements s-graph algebras and tree algebras. The alto system also supports probabilistic parsing with weighted grammars, and we rely on rule weights to ensure that rules which map subgraphs to attributes always take precedence over the ‘empty’ rules that are only added to the grammar to ensure that the full graph is derivable. In those few cases when more than one such ‘content’ rule matches the same subgraph, precedence is given to rules that cover larger substructures. The trees output by the IRTG parser serve as the input to the heuristic construction of rules described in the previous section.
Finally, rules are converted from the go beyond the simplicity of such a simple categorization generic (JSON) format to the language of dyadic deontic (and the scope of this work), in practice this small en- logic, as described in Section 3.3. hancement of the semantic representation of the input text was sufficient to allow for the detection of modality 4.4. The prover by the rule extraction mechanism. Finally we also added an ad-hoc rule for detecting exceptions: the presence The final step in our pipeline consists of an exemplary rea- of the word sofern and soweit, both roughly equivalent soning mechanism to draw inferences from the extracted to the English conjunction ‘provided’ and introducing a rules. This step is based on our adaption13 of the generic theorem prover deonticProver2.014 which implements clause that limits the applicability of a previous statement, backwards proof search in a sequent system for a dyadic triggers the addition of an element EXC to the semantic deontic logic extended with rules for defeasibly reasoning graph which is then also available for processing by the from deontic assumptions [21]. Apart from specifying rule extraction mechanism. the prover to the language obtained from the examples we needed to further modify it in two ways. First, in order 4.3. Rule extraction to be able to handle attributes with numerical arguments, such as DachneigungMax for the maximal angle of the We now describe the implementation details of the two- roof, or with strings as argument, such as Dachart for step rule extraction method presented in Section 3.2. The the roof type, we extended the prover and the underly- output of the semantic parser, which serves as the input ing reasoning system to handle atomic propositions with to rule extraction, is a single directed graph for each in- arguments. 
In addition, we added ground sequents, i.e., put sentence, generated by an Interpreted Regular Tree structures which can be used as leaves in a derivation, cor- responding to basic properties of measure-like attributes Grammar from Universal Dependency structures (see with natural numbers as values: Where msr is a basic Section 2.1 for details). For recognizing subgraphs and attribute for a measure such as Dachneigung, we con- mapping them to attributes we also use an IRTG over sidered a triple consisting of the attributes msrGenau(𝑛), an algebra of s-graphs, this allows us to pipe the output msrMin(𝑛) and msrMax(𝑛), expressing the facts that msr of the semantic parser directly into our rule extraction is exactly 𝑛, at least 𝑛, or at most 𝑛, respectively. The grammar. For each 4lang graph we dynamically gener- relations between these three attributes are given by: ate a unique grammar. The static set of rules encoding • msrGenau(𝑛) → msrMin(𝑛) ∧ msrMax(𝑛) the correspondence between generic semantic structures • msrMin(𝑛) → msrMin(𝑚), where 𝑚 ≤ 𝑛 and task-specific attributes is extended with empty termi- nal rules for each concept in the input graph, this ensures • msrMax(𝑛) → msrMax(𝑚), where 𝑛 ≤ 𝑚 that the entire graph can be constructed by a sequence of • msrMax(𝑛) → ¬msrMin(𝑚), where 𝑛 < 𝑚 operations that is derivable by the underlying RTG and The ground sequents added to the prover then absorb ba- thus the object can be parsed by the IRTG. The output sic reasoning on these axioms, so that, e.g., the formulae interpretation of the IRTG is an algebra of trees, whose leaves are the individual strings that we use to construct ¬(DachneigungGenau(𝑛) ∧ DachneigungGenau(𝑚)) rules in a subsequent step. 
The trees resemble the order 12 https://github.com/coli-saar/alto in which these strings (names and values of attributes, 13 https://github.com/blellmann/BRISEprover modal elements, units of measurement) have been added 14 http://subsell.logic.at/bprover/deonticProver/version2.0/ 8 Gabor Recski et al. CEUR Workshop Proceedings 1–13 are derivable for 𝑛 ̸= 𝑚, stating that the exact angle of check once in a preprocessing stage, store for every de- a roof does not have two different values. ontic assumption the list of conflicting ones, and only Second, and more significantly, to be more in line with check that none of the assumptions in this list is appli- other approaches in the area of deontic reasoning such cable and more specific during the actual computation. as [31] as well as for efficiency reasons we modified the In our experiments this increased efficiency was neces- mechanism how the prover handles specificity reasoning sary for reasoning with a non-trivial number of deontic when reasoning from deontic assumptions. To illustrate, assumptions. To compare the two reasoning methods assume the deontic assumption the user can switch between the original (“classic”) and obl(DachneigungMax(5) ∧ BegruenungDach, modified (“modern”) versions on the web interface15 for (1) the prover. For the sake of simplicity the web interface Plangeb(7181)) for the whole pipeline only uses the modified version. stating that the maximal angle of the roof must be 5 Originally, for derivable input deonticProver2.0 out- degrees and the roof must be green under the condition puts a pdf file with a derivation in the calculus. How- that the building is in zone 7181. 
This would be partially ever, since the derivations can become rather large (even overruled by the additional more specific assumption breaking the maximal limit on object size in TeX) and the average user might not be acquainted with the specific obl(¬BegruenungDach, Plangeb(7181) (2) formalism used in the prover, we further extended the ∧Planzeichen(BB1)) output module with an option to print the derivation as stating that roofs in areas of zone 7181 marked with the an explanation in pseudo-natural language. Explanations label BB1 on the map must be not green roofs. The latter can be unfolded step by step by clicking on a button la- assumption is considered more specific than (1) because belled “Why?” after the “The statement ... is derivable.” its condition Plangeb(7181) ∧ Planzeichen(BB1) output. In unfolding the explanation, propositional steps strictly implies the condition Plangeb(7181) of assump- are skipped by default to reveal the crucial deontic state- tion (1). In deonticProver2.0 the assumption (1) could ments and assumed facts used there. These intermediary still be used to infer obligations from the part of its con- steps can additionally be unfolded by clicking on a “Why tent not in conflict with the content of the more specific does it follow from the above?” button. In the demo the assumption (2), such as user can select the output format. obl(¬DachneigungGenau(7), Plangeb(7181) We stress again that here our prover serves mainly as (3) an example for a possible reasoning mechanism and that ∧Planzeichen(BB1)) we do not claim that the underlying logic is necessarily stating that in areas of zone 7181 marked with the label the most appropriate. For this reason we also defer the BB1 the exact angle of the roof must not be 7 degrees. theoretical details of the modifications of the underlying In our prover we changed this behaviour so that any sequent system to a forthcoming companion paper. 
assumption which is a in conflict with a more specific applicable one cannot be used to derive any obligations. Thus (disregarding prohibition and permission opera- 5. Evaluation tors for the sake of exposition) to check whether (3) is derivable we now check whether there is an assumption The rule systems presented in Sec. 3.2 were developed obl(𝐴, 𝐵) such that based on a small annoted sample of sentences from 1. 𝐴 → ¬DachneigungGenau(7) is derivable the zoning plan of the City of Vienna. In order to es- tablish a representative sample, we started by estimat- 2. Plangeb(7181) ∧ Planzeichen(BB1) → 𝐵 is deriv- ing the distribution of attributes in the entire corpus able by manually labeling the sentences of 10 randomly se- 3. there is no applicable and more specific assumption lected documents with the attributes they mention (ei- conflicting with obl(𝐴, 𝐵), i.e., there is no obl(𝐶, 𝐷) ther as condition or content). This sample contains 344 such that mentions of attributes in 193 sentences (as well as 118 a) Plangeb(7181) ∧ Planzeichen(BB1) → 𝐷 sentences without attribute mentions, mostly from the and 𝐷 → 𝐵 are derivable preambles). The number of unique attributes in the sam- b) 𝐶 → ¬𝐴 is derivable ple is 84, but 193 of the 344 instances (56%) come from Crucially, the “no-conflict” check in item (3b) above only the 16 most frequent attributes. We then chose 6 sen- needs to be performed between two assumptions and tences from this sample that together contain mentions not between the formula to be proved and an assump- of 7 of these 16 attributes, including the 3 most fre- tion. This means that instead of checking for conflicts quent ones (GebaeudeHoeheMax, AbschlussDachMax, many times redundantly in the search for a derivation GebaeudeHoeheArt) that are alone responsible for 17% (as is done in deonticProver2.0) it suffices to perform this 15 http://subsell.logic.at/bprover/briseprover/ 9 Gabor Recski et al. 
CEUR Workshop Proceedings 1–13 of all attribute mentions in the larger sample. We anno- an appropriate translation. Second, our propositional tated these 6 sentences with the full representation of all language is currently rather restricted, since we do not rules stated by them and developed our rule extraction permit, e.g., disjunctions in the conditions or content system to achieve perfect performance on this toy corpus. of an obligation. Again, this could be addressed rather Both this fully annotated set and the larger sample of straightforwardly by extending the format of our repre- 10 documents annotated for attribute mentions only are sentation, possibly along the lines of JsonLogic17 . We also released along with the software16 . While our method of do not consider quantification or nested deontic opera- selecting the sentences for the toy corpus ensures that tors. For the current application these features seemed the attribute extraction step of our method has high cov- not to be necessary. Most of these limitations are in line erage (recall above 51% with a precision above 93% on with other current approaches, e.g., [16, 28]. the sample of 10 documents and 344 attribute instances), The proof-of-concept application presented in this pa- this cannot be considered as quantitative evaluation of per can serve as a blueprint for semantics-based solutions the full rule extraction pipeline. The limited amount to a wide range of information extraction tasks includ- of annotated data also does not permit any conclusions ing variants of entity recognition and relation extraction. about the effect of errors in syntactic parsing made by Such systems are generally more flexible, interpretable, the stanza model, but our assumption that this should not and less prone to bias than the large neural network mod- become a bottleneck for such standard text is reinforced els used for similar tasks. 
However, to make such systems by the fact that we did not observe any such errors in our a viable alternative for everyday NLP applications, novel sample. A larger-scale annotation of attribute mentions methods must be devised for the (semi-)automatic learn- is currently in progress. ing of task-specific rule systems like the one manually built for this project. Concerning the automated reason- ing part, we plan to consider specifications to different 6. Discussion frameworks in the future, including those of argumen- tation theory [27], I/O logic [28], and defeasible deontic In this article we have presented a system for extract- logic [16], and integrate existing provers for these for- ing formal rules from legal text using generic semantic malisms such as TOAST18 , SPINdle19 or TurnipBox20 . Ad- parsing and domain-specific pattern-matching, and con- ditionally, we plan to implement alternative translations verting them to deontic logic for use in an automated from the generic representation to the language of dyadic reasoning system. All components of the pipeline, includ- deontic logic, corresponding to different interpretations ing those contributed in this paper, are made available of the logical structure of deontic statements. Along the as open-source software under the MIT license, for un- lines of [35] this could be used to compare such differ- restricted use in future applications. Unlike machine ent interpretations. Finally, we would like to investigate learning based information extraction systems, our rule whether the part of our pipeline creating general rule extraction model is fully explainable and serves as an representations could be used in combination with the example for a specific application of semantic parsing NAI suite [17]. Our rule-based approach could be used as to domain-specific information extraction. 
While the se- a first step to automatically suggest a formalisation of a mantic representation and parsing algorithms used in given legal text, which then could be converted into the our pipeline are language-agnostic, they may require format used in the NAI suite and run through the quality adaptation to new languages and domains. Furthermore, assurance function provided there. The benefit would be for domains and text genres that more closely resemble that the legal experts do not need to actively formulate everyday language use, deep semantic analysis would the formalisation of a legal text, but only to potentially require lexical inference, a notoriously difficult task in adjust it based on the quality assurance checks. computational semantics [32, 33]. In our general rule representation we concentrated on deontic statements of a reasonably simple form. While this form seems to be Acknowledgments well adapted to the regulations provided in the texts for the zoning maps of Vienna, there are some obvious limi- We are grateful to the three anonymous reviewers for tations. First, since we always assume the presence of a their suggestions and for additional references. Work deontic modality (obligation, prohibition or permission), supported by BRISE-Vienna (UIA04-081), a European at the moment we cannot treat constitutive norms [34] Union Urban Innovative Actions project. such as “The area marked on the map with the label BB1 is designated a residential area”. This issue could be addressed by adding an additional modality “consitu- 17 https://jsonlogic.com tiveNorm” to the general representation together with 18 http://toast.arg-tech.org/ 19 http://spindle.data61.csiro.au/spindle/ 16 20 https://github.com/recski/brise-plandok https://turnipbox.netlify.app 10 Gabor Recski et al. CEUR Workshop Proceedings 1–13 References sociation for Computational Linguistics, Online, 2020, pp. 40–52. URL: https://www.aclweb.org/ [1] L. Banarescu, C. Bonial, S. Cai, M. 
Georgescu, anthology/2020.conll-shared.4. doi:10.18653/v1/ K. Griffitt, U. Hermjakob, K. Knight, P. Koehn, 2020.conll-shared.4. M. Palmer, N. Schneider, Abstract Meaning Rep- [8] D. Samuel, M. Straka, ÚFAL at MRP 2020: resentation for sembanking, in: Proceedings of Permutation-invariant semantic parsing in PERIN, the 7th Linguistic Annotation Workshop and Inter- in: Proceedings of the CoNLL 2020 Shared Task: operability with Discourse, Association for Com- Cross-Framework Meaning Representation Pars- putational Linguistics, Sofia, Bulgaria, 2013, pp. ing, Association for Computational Linguistics, On- 178–186. URL: https://www.aclweb.org/anthology/ line, 2020, pp. 53–64. URL: https://www.aclweb.org/ W13-2322. anthology/2020.conll-shared.5. doi:10.18653/v1/ [2] C. Lyu, I. Titov, AMR parsing as graph pre- 2020.conll-shared.5. diction with latent alignment, in: Proceedings [9] A. Kornai, The algebra of lexical semantics, in: of the 56th Annual Meeting of the Association C. Ebert, G. Jäger, J. Michaelis (Eds.), Proceedings of for Computational Linguistics (Volume 1: Long the 11th Mathematics of Language Workshop, LNAI Papers), Association for Computational Linguis- 6149, Springer, 2010, pp. 174–199. doi:10.5555/ tics, Melbourne, Australia, 2018, pp. 397–407. 1886644.1886658. URL: https://www.aclweb.org/anthology/P18-1037. [10] G. Recski, Building concept definitions from ex- doi:10.18653/v1/P18-1037. planatory dictionaries, International Journal of Lex- [3] S. Zhang, X. Ma, K. Duh, B. Van Durme, AMR icography 31 (2018) 274–311. doi:10.1093/ijl/ parsing as sequence-to-graph transduction, in: Pro- ecx007. ceedings of the 57th Annual Meeting of the Associa- [11] Á. Kovács, K. Gémes, A. Kornai, G. Recski, tion for Computational Linguistics, Association for BMEAUT at SemEval-2020 task 2: Lexical en- Computational Linguistics, Florence, Italy, 2019, pp. tailment with semantic graphs, in: Proceed- 80–94. 
URL: https://www.aclweb.org/anthology/ ings of the Fourteenth Workshop on Semantic P19-1009. doi:10.18653/v1/P19-1009. Evaluation, International Committee for Compu- [4] O. Abend, A. Rappoport, Universal Conceptual tational Linguistics, Barcelona (online), 2020, pp. Cognitive Annotation (UCCA), in: Proceedings 135–141. URL: https://www.aclweb.org/anthology/ of the 51st Annual Meeting of the Association 2020.semeval-1.15. for Computational Linguistics (Volume 1: Long [12] J. Nivre, M. Abrams, Ž. Agić, L. Ahrenberg, L. An- Papers), Association for Computational Linguis- tonsen, K. Aplonova, M. J. Aranzabe, G. Arutie, tics, Sofia, Bulgaria, 2013, pp. 228–238. URL: https: M. Asahara, L. Ateyah, M. Attia, A. Atutxa, L. Au- //www.aclweb.org/anthology/P13-1023. gustinus, E. Badmaeva, M. Ballesteros, E. Baner- [5] D. Hershcovich, O. Abend, A. Rappoport, A jee, S. Bank, V. Barbu Mititelu, V. Basmov, J. Bauer, transition-based directed acyclic graph parser for S. Bellato, K. Bengoetxea, Y. Berzak, I. A. Bhat, UCCA, in: Proceedings of the 55th Annual R. A. Bhat, E. Biagetti, E. Bick, R. Blokland, V. Bo- Meeting of the Association for Computational bicev, C. Börstell, C. Bosco, G. Bouma, S. Bow- Linguistics (Volume 1: Long Papers), Associa- man, A. Boyd, A. Burchardt, M. Candito, B. Caron, tion for Computational Linguistics, Vancouver, G. Caron, G. Cebiroğlu Eryiğit, F. M. Cecchini, Canada, 2017, pp. 1127–1138. URL: https://www. G. G. A. Celano, S. Čéplö, S. Cetin, F. Chalub, J. Choi, aclweb.org/anthology/P17-1104. doi:10.18653/ Y. Cho, J. Chun, S. Cinková, A. Collomb, Ç. Çöl- v1/P17-1104. tekin, M. Connor, M. Courtin, E. Davidson, M.- [6] D. Hershcovich, O. Abend, A. Rappoport, Multi- C. de Marneffe, V. de Paiva, A. Diaz de Ilarraza, task parsing across semantic representations, in: C. Dickerson, P. Dirix, K. Dobrovoljc, T. Dozat, Proceedings of the 56th Annual Meeting of the As- K. Droganova, P. Dwivedi, M. Eli, A. 
Elkahky, sociation for Computational Linguistics (Volume 1: B. Ephrem, T. Erjavec, A. Etienne, R. Farkas, Long Papers), Association for Computational Lin- H. Fernandez Alcalde, J. Foster, C. Freitas, K. Gaj- guistics, Melbourne, Australia, 2018, pp. 373–385. došová, D. Galbraith, M. Garcia, M. Gärdenfors, URL: https://www.aclweb.org/anthology/P18-1035. S. Garza, K. Gerdes, F. Ginter, I. Goenaga, K. Go- doi:10.18653/v1/P18-1035. jenola, M. Gökırmak, Y. Goldberg, X. Gómez Guino- [7] H. Ozaki, G. Morio, Y. Koreeda, T. Morishita, vart, B. Gonzáles Saavedra, M. Grioni, N. Grūzı̄tis, T. Miyoshi, Hitachi at MRP 2020: Text- B. Guillaume, C. Guillot-Barbance, N. Habash, J. Ha- to-graph-notation transducer, in: Proceed- jič, J. Hajič jr., L. Hà Mỹ, N.-R. Han, K. Harris, ings of the CoNLL 2020 Shared Task: Cross- D. Haug, B. Hladká, J. Hlaváčová, F. Hociung, Framework Meaning Representation Parsing, As- P. Hohle, J. Hwang, R. Ion, E. Irimia, O.. Ishola, 11 Gabor Recski et al. CEUR Workshop Proceedings 1–13 T. Jelínek, A. Johannsen, F. Jørgensen, H. Kaşıkara, ings of the 58th Annual Meeting of the Associa- S. Kahane, H. Kanayama, J. Kanerva, B. Katz, tion for Computational Linguistics: System Demon- T. Kayadelen, J. Kenney, V. Kettnerová, J. Kirchner, strations, Association for Computational Linguis- K. Kopacewicz, N. Kotsyba, S. Krek, S. Kwak, V. Laip- tics, Online, 2020, pp. 101–108. URL: https://www. pala, L. Lambertino, L. Lam, T. Lando, S. D. Larasati, aclweb.org/anthology/2020.acl-demos.14. doi:10. A. Lavrentiev, J. Lee, P. Lê Hồng, A. Lenci, S. Lertpra- 18653/v1/2020.acl-demos.14. dit, H. Leung, C. Y. Li, J. Li, K. Li, K. Lim, N. Ljubešić, [14] A. Koller, Semantic construction with graph gram- O. Loginova, O. Lyashevskaya, T. Lynn, V. Macke- mars, in: Proceedings of the 11th International tanz, A. Makazhanov, M. Mandl, C. Manning, R. Ma- Conference on Computational Semantics, Associ- nurung, C. Mărănduc, D. Mareček, K. 
Marheinecke, ation for Computational Linguistics, London, UK, H. Martínez Alonso, A. Martins, J. Mašek, Y. Mat- 2015, pp. 228–238. URL: https://www.aclweb.org/ sumoto, R. McDonald, G. Mendonça, N. Miekka, anthology/W15-0127. M. Misirpashayeva, A. Missilä, C. Mititelu, Y. Miyao, [15] M. Sergot, F. Sadri, R. Kowalski, F. Kriwaczek, S. Montemagni, A. More, L. Moreno Romero, K. S. P. Hammond, H. Cory, The British Nationality Act Mori, S. Mori, B. Mortensen, B. Moskalevskyi, as a logic program, Communications of the ACM K. Muischnek, Y. Murawaki, K. Müürisep, P. Nain- 29 (1986). wani, J. I. Navarro Horñiacek, A. Nedoluzhko, [16] G. Governatori, Practical normative reasoning G. Nešpore-Bērzkalne, L. Nguyễn Thi., H. Nguyễn with defeasible deontic logic, in: C. d’Amato, Thi. Minh, V. Nikolaev, R. Nitisaroj, H. Nurmi, M. Theobald (Eds.), Reasoning Web 2018, volume S. Ojala, M. Olúòkun, Adédayo.and Omura, P. Osen- 11078 of LNCS, Springer, 2018, pp. 1–25. ova, R. Östling, L. Øvrelid, N. Partanen, E. Pas- [17] T. Libal, A. Steen, NAI: Towards transparent and cual, M. Passarotti, A. Patejuk, G. Paulino- usable semi-automated legal analysis, in: IRIS 2020, Passos, S. Peng, C.-A. Perez, G. Perrier, S. Petrov, Editions Weblaw, 2020, pp. 265–272. J. Piitulainen, E. Pitler, B. Plank, T. Poibeau, [18] S. Batsakis, G. Baryannis, G. Governatori, I. Tach- M. Popel, L. Pretkalnin, a, S. Prévost, P. Proko- mazidis, G. Antoniou, Legal representation and pidis, A. Przepiórkowski, T. Puolakainen, S. Pyysalo, reasoning in practice: A critical comparison, in: A. Rääbis, A. Rademaker, L. Ramasamy, T. Rama, JURIX 2018, IOS Press, 2018, pp. 31–40. URL: https: C. Ramisch, V. Ravishankar, L. Real, S. Reddy, //doi.org/10.3233/978-1-61499-935-5-31. G. Rehm, M. Rießler, L. Rinaldi, L. Rituma, L. Rocha, [19] R. Calegari, G. Contissa, F. Lagioia, A. Omicini, M. Romanenko, R. Rosa, D. Rovati, V. Ros, ca, G. Sartor, Defeasible systems in legal reasoning: A O. Rudina, J. Rueter, S. Sadde, B. Sagot, S. 
Saleh, comparative assessment, in: JURIX 2019, IOS Press, T. Samardžić, S. Samson, M. Sanguinetti, B. Saulı̄te, 2019, pp. 169–174. doi:10.3233/FAIA190320. Y. Sawanakunanon, N. Schneider, S. Schuster, [20] T. Libal, A meta-level annotation language D. Seddah, W. Seeker, M. Seraji, M. Shen, A. Shi- for legal texts, in: M. Dastani, H. Dong, mada, M. Shohibussirri, D. Sichinava, N. Silveira, L. van der Torre (Eds.), CLAR 2020, volume 12061 of M. Simi, R. Simionescu, K. Simkó, M. Šimková, LNCS, Springer, 2020, pp. 131–150. doi:10.1007/ K. Simov, A. Smith, I. Soares-Bastos, C. Spadine, 978-3-030-44638-3_9. A. Stella, M. Straka, J. Strnadová, A. Suhr, U. Su- [21] A. Ciabattoni, B. Lellmann, Sequent rules for rea- lubacak, Z. Szántó, D. Taji, Y. Takahashi, T. Tanaka, soning and conflict resolution in conditional norms, I. Tellier, T. Trosterud, A. Trukhina, R. Tsarfaty, in: F. Liu, A. Marra, P. Portner, F. V. D. Putte (Eds.), F. Tyers, S. Uematsu, Z. Urešová, L. Uria, H. Uszko- DEON 2020/2021, College Publications, 2021. reit, S. Vajjala, D. van Niekerk, G. van Noord, [22] D. Merigoux, N. Chataing, J. Protzenko, Catala: V. Varga, E. Villemonte de la Clergerie, V. Vincze, A programming language for the law, CoRR L. Wallin, J. X. Wang, J. N. Washington, S. Williams, abs/2103.03198 (2021). URL: https://arxiv.org/abs/ M. Wirén, T. Woldemariam, T.-s. Wong, C. Yan, 2103.03198. arXiv:2103.03198. M. M. Yavrumyan, Z. Yu, Z. Žabokrtský, A. Zeldes, [23] N. O. Nawari, A generalized adaptive frame- D. Zeman, M. Zhang, H. Zhu, Universal depen- work (GAF) for automating code compliance dencies 2.3, 2018. URL: http://hdl.handle.net/11234/ checking, Buildings 9 (2019). URL: https:// 1-2895, LINDAT/CLARIN digital library at the In- www.mdpi.com/2075-5309/9/4/86. doi:10.3390/ stitute of Formal and Applied Linguistics (ÚFAL), buildings9040086. Faculty of Mathematics and Physics, Charles Uni- [24] A. Sleimi, N. Sannier, M. Sabetzadeh, L. C. Briand, versity. M. Ceci, J. 
Dann, An automated framework for the [13] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C. D. Man- extraction of semantic legal metadata from legal ning, Stanza: A python natural language processing texts, Empirical Software Engineering 26 (2021) 43. toolkit for many human languages, in: Proceed- doi:10.1007/s10664-020-09933-5. 12 Gabor Recski et al. CEUR Workshop Proceedings 1–13 [25] J. Morris, Blawx: Rules as code demonstration, MIT Computational Law Report (2020). URL: https://law. mit.edu/pub/blawxrulesascodedemonstration. [26] J. Zhang, N. M. El-Gohary, Automated information transformation for automated regulatory compli- ance checking in construction, Journal of Com- puting in Civil Engineering 29 (2015) B4015001. doi:10.1061/(ASCE)CP.1943-5487.0000427. [27] S. Modgil, H. Prakken, The ASPIC+ framework for structured argumentation: A tutorial, Argument and Computation 5 (2014) 31–62. URL: http://dx.doi. org/10.1080/19462166.2013.869766. [28] X. Parent, L. van der Torre, Input/output logic, in: D. Gabbay, J. Horty, X. Parent, R. van der Meyden, L. van der Torre (Eds.), Handbook of Deontic Logic and Normative Systems, College Publications, 2013, pp. 495–544. [29] A. Koller, M. Kuhlmann, A generalized view on parsing and translation, in: Proceedings of the 12th International Conference on Parsing Technologies, Association for Computational Linguistics, Dublin, Ireland, 2011, pp. 2–13. URL: https://www.aclweb. org/anthology/W11-2902. [30] R. Reiter, A logic for default reasoning, Artificial Intelligence (1980). [31] J. F. Horty, Reasons as Defaults, Oxford University Press, 2012. [32] A. Talman, S. Chatzikyriakidis, Testing the gener- alization power of neural network models across NLI benchmarks, in: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Inter- preting Neural Networks for NLP, Association for Computational Linguistics, Florence, Italy, 2019, pp. 85–94. URL: https://www.aclweb.org/anthology/ W19-4810. doi:10.18653/v1/W19-4810. [33] M. 
Schmitt, H. Schütze, Language models for lexical inference in context, in: Proceedings of the 16th Conference of the European Chapter of the Associ- ation for Computational Linguistics: Main Volume, Association for Computational Linguistics, Online, 2021, pp. 1267–1280. URL: https://www.aclweb.org/ anthology/2021.eacl-main.108. [34] G. Boella, L. W. N. van der Torre, Regulative and constitutive norms in normative multiagent sys- tems, in: D. Dubois, C. A. Welty, M. Williams (Eds.), KR2004, AAAI Press, 2004, pp. 255–266. URL: http: //www.aaai.org/Library/KR/2004/kr04-028.php. [35] B. Lellmann, F. Gulisano, A. Ciabattoni, Mı̄mām . sā deontic reasoning using speci- ficity: A proof theoretic approach, Artificial Intelligence and Law (published online 2020). doi:10.1007/s10506-020-09278-w. 13
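As an illustration of the translation from the generic rule representation to deontic formulae described in Section 3.3, the following is a minimal sketch and not the released implementation. The attribute fields `type`, `name` and `value`, the ASCII connectives (`&`, `|`, `~`), the constants `TRUE` and `FALSE` used for an empty conjunction or disjunction, and the example values `Flachdach` and `Glasdach` are all assumptions made for this sketch; only the three translation schemes themselves are taken from the text.

```python
from typing import Any, Dict, List

# Assumed conventions: an empty conjunction is TRUE, an empty disjunction is FALSE.
TRUE, FALSE = "TRUE", "FALSE"


def atoms(attributes: List[Dict[str, Any]], attr_type: str) -> List[str]:
    """Render all attributes of the given type as atoms name(value)."""
    result = []
    for attr in attributes:
        if attr["type"] != attr_type:
            continue
        value = attr.get("value")
        result.append(attr["name"] if value is None else f"{attr['name']}({value})")
    return result


def join(parts: List[str], op: str, empty: str) -> str:
    """Join atoms with a connective; parenthesize only if there are several."""
    if not parts:
        return empty
    if len(parts) == 1:
        return parts[0]
    return "(" + f" {op} ".join(parts) + ")"


def translate(rule: Dict[str, Any]) -> str:
    """Map {"modality": ..., "attributes": [...]} to a dyadic deontic formula string."""
    attrs = rule["attributes"]
    cnd = join(atoms(attrs, "condition"), "&", TRUE)              # Cnd
    cnt_ex = join(atoms(attrs, "contentException"), "|", FALSE)   # CntEx
    cnd_ex = join(atoms(attrs, "conditionException"), "|", FALSE) # CndEx
    content = atoms(attrs, "content")
    cnt = join(content, "&", TRUE)       # Cnt: conjunction of content atoms
    cnt_for = join(content, "|", FALSE)  # Cnt_for: disjunction of content atoms

    modality = rule["modality"]
    if modality == "obligation":
        return f"obl({cnt} | {cnt_ex}, {cnd}) & per(~({cnt} | {cnt_ex}), {cnd_ex})"
    if modality == "prohibition":
        return f"for({cnt_for} & ~{cnt_ex}, {cnd}) & per({cnt_for} & ~{cnt_ex}, {cnd_ex})"
    if modality == "permission":
        return f"per({cnt} | {cnt_ex}, {cnd}) & for({cnt} | {cnt_ex}, {cnd_ex})"
    raise ValueError(f"unknown modality: {modality!r}")


# "Flat roofs should be green roofs unless they are glass roofs" (hypothetical encoding).
green_roof = {
    "modality": "obligation",
    "attributes": [
        {"type": "condition", "name": "Dachart", "value": "Flachdach"},
        {"type": "content", "name": "BegruenungDach", "value": None},
        {"type": "conditionException", "name": "Dachart", "value": "Glasdach"},
    ],
}
print(translate(green_roof))
```

In this sketch the condition exception ends up in a separate permission conjunct rather than inside the condition of the obligation, mirroring the design choice discussed in Section 3.3.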