A Tool for Encoding Controlled Natural Language Specifications as ASP Rules Carmine Dodaro1 , Marco Maratea2 and Francesco Riccio1 1 DeMaCS, University of Calabria, Italy 2 DIBRIS, University of Genoa, Italy Abstract Answer Set Programming (ASP) is a popular declarative programming language for solving hard com- binatorial problems. Albeit ASP has been widely adopted in both academic and industrial contexts, it might be difficult for people who are not familiar with logic programming conventions to use it. In this paper, we propose a translation of English sentences expressed in a controlled natural language (CNL) form into ASP. In particular, we first provide a definition of the type of sentences allowed by our CNL and their translation as ASP rules, and then exemplify the usage of CNL for the specification of well-known combinatorial problems. Keywords Answer set programming, Logic Programming, Controlled Natural Language 1. Introduction Answer Set Programming (ASP) [1] is a well-known declarative programming paradigm geared toward solving hard combinatorial problems. As a matter of fact, ASP has been widely used for solving problems in both academic and industrial contexts [2]. The success of ASP is mainly due to its simple syntax and intuitive semantics, and to the availability of efficient systems [3, 4]. However, despite its success, the usage of ASP is still problematic for operators without mathematical or logical background. In this paper, we propose a system that automatically translates English sentences expressed in a controlled natural language (CNL) [5] into ASP rules. Such a CNL is oriented towards three different types of use cases, i.e., (i) offering a tool to non-ASP experts for specifying ASP programs; (ii) offering a fast prototyping tool to ASP experts for creating intuitive ASP encodings which are subsequently subject to optimization; and (iii) improving the readability of ASP programs, since there is a one-to-one mapping between ASP rules and English specifications. In particular, the CNL presented in this paper is inspired by the Semantics of Business Vocabulary and Business Rules (SBVR) [6, 7], which is a standard proposed by the Object Management Group to formally describe complex entities, e.g. the ones related to a business, using natural language. Previous attempts of translating CNL into ASP were proposed in [8, 9], and more recently, by Schwitter in [10] who defined the language PENGASP . Albeit Datalog 2.0 2022: 4th International Workshop on the Resurgence of Datalog in Academia and Industry, September 05, 2022, Genova - Nervi, Italy $ carmine.dodaro@unical.it (C. Dodaro); marco.maratea@unige.it (M. Maratea); riccio.francesco@outlook.it (F. Riccio) Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 188 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 some aspects of the grammar rules of PENGASP are also present in the grammar of our CNL, we propose a language that is oriented in the definition of combinatorial problems in a natural- feeling, but that is also reliable in its translation to ASP, avoiding using more than one grammar construct for the same behaviour and choosing words with an easily deducible meaning from the given context. Moreover, it should be noted that the implementation of PENGASP , as well as a binary executable, is not yet public, whereas the implementation of the system subject of this paper is open source and publicly-available at https://github.com/dodaro/cnl2asp. 2. Preliminaries Answer Set Programming (ASP) [1] is a programming paradigm developed in the field of non- monotonic reasoning and logic programming. In this section, we overview the language of ASP [11]. Syntax. The syntax of ASP is similar to the one of Prolog. Variables are strings starting with an uppercase letter, and constants are non-negative integers or strings starting with lowercase letters. A term is either a variable or a constant. A standard atom is an expression p(t1 ,. . .,t𝑛 ), where p is a predicate of arity 𝑛 and t1 ,. . .,t𝑛 are terms. An atom p(t1 ,. . .,t𝑛 ) is ground if t1 ,. . .,t𝑛 are constants. A ground set is a set of pairs of the form βŸ¨π‘π‘œπ‘›π‘ π‘‘π‘  : π‘π‘œπ‘›π‘—βŸ©, where π‘π‘œπ‘›π‘ π‘‘π‘  is a list of constants and π‘π‘œπ‘›π‘— is a conjunction of ground standard atoms. A symbolic set is a set specified syntactically as {Terms1 : Conj1 ; . . ., Terms𝑑 : Conj𝑑 } where 𝑑 > 0, and for all 𝑖 ∈ [1, 𝑑], each Terms𝑖 is a non-empty list of terms, and each Conj𝑖 is a non-empty conjunction of standard atoms. A set term is either a symbolic set or a ground set. Intuitively, a set term {X:a(X,c),p(X); Y:b(Y,m)} stands for the union of two sets: the first one contains the X- values making the conjunction a(X,c), p(X) true, and the second one contains the Y-values making the conjunction b(Y,m) true. An aggregate function is of the form 𝑓 (𝑆), where 𝑆 is a set term, and 𝑓 is an aggregate function symbol. Basically, aggregate functions map multisets of constants to a constant. The most common functions implemented in ASP systems are #count, for counting number of terms; and #sum, for computing sum of integers [12]. An aggregate atom is of the form 𝑓 (𝑆) β‰Ί 𝑇 , where 𝑓 (𝑆) is an aggregate function, β‰Ί ∈ {<, <=, >, >=, !=, =} is a comparison operator, and 𝑇 is a term called guard. An aggregate atom 𝑓 (𝑆) β‰Ί 𝑇 is ground if 𝑇 is a constant and 𝑆 is a ground set. An atom is either a standard atom or an aggregate atom. A rule π‘Ÿ has the following form: a1 | . . . | a𝑛 :- b1 , . . ., bπ‘˜ , not bπ‘˜+1 , . . ., not bπ‘š . where a1 ,. . .,a𝑛 are standard atoms (with 𝑛 β‰₯ 0), b1 ,. . .,bπ‘˜ are atoms, and bπ‘˜+1 ,. . .,bπ‘š are standard atoms (with π‘š β‰₯ π‘˜ β‰₯ 0). A literal is either a standard atom a or its negation not a. The disjunction a1 | . . . | a𝑛 is the head of π‘Ÿ, while the conjunction b1 , . . ., bπ‘˜ , not bπ‘˜+1 , . . ., not bπ‘š is its body. Rules with empty body are called facts. Rules with empty head are called constraints. A variable that appears uniquely in set terms of a rule π‘Ÿ is said to be local in π‘Ÿ, otherwise it is a global variable of π‘Ÿ. An ASP program is a set of safe rules, where a rule π‘Ÿ is safe if the following conditions hold: (i) for each global variable 𝑋 of π‘Ÿ there is a positive standard atom β„“ in the body of π‘Ÿ such that 𝑋 appears in β„“; and (ii) each local variable of π‘Ÿ appearing in a 189 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 symbolic set {Terms : Conj} also appears in a positive atom in Conj. A weak constraint πœ” [13] is of the form: :∼ b1 , . . ., bπ‘˜ , not bπ‘˜+1 , . . ., not bπ‘š . [w@l, t1 , . . ., t𝑧 ] where t1 , . . ., t𝑧 are terms, w and l are the weight and level of πœ”, respectively. Intuitively, [w@l] is read β€œas weight w at level l”, where weight is the β€œcost” of violating the condition in the body of w, whereas levels can be specified for defining a priority among preference criteria. An ASP program with weak constraints is Ξ  = βŸ¨π‘ƒ, π‘Š ⟩, where 𝑃 is a program and π‘Š is a set of weak constraints. A standard atom, a literal, a rule, a program or a weak constraint is ground if no variables appear in it. Semantics. Let 𝑃 be an ASP program. The Herbrand universe π‘ˆπ‘ƒ and the Herbrand base 𝐡𝑃 of 𝑃 are defined as usual. The ground instantiation 𝐺𝑃 of 𝑃 is the set of all the ground instances of rules of 𝑃 that can be obtained by substituting variables with constants from π‘ˆπ‘ƒ . An interpretation 𝐼 for 𝑃 is a subset 𝐼 of 𝐡𝑃 . A ground standard atom p is true w.r.t. 𝐼 if p ∈ 𝐼 , and false otherwise. A literal not p is true w.r.t. 𝐼 if p is false w.r.t. 𝐼, and false otherwise. An aggregate atom is true w.r.t. 𝐼 if the evaluation of its aggregate function (i.e., the result of the application of 𝑓 on the multiset 𝑆) with respect to 𝐼 satisfies the guard; otherwise, it is false. A ground rule π‘Ÿ is satisfied by 𝐼 if at least one atom in the head is true w.r.t. 𝐼 whenever all conjuncts of the body of π‘Ÿ are true w.r.t. 𝐼. A model is an interpretation that satisfies all rules of a program. Given a ground program 𝐺𝑃 and an interpretation 𝐼, the reduct of 𝐺𝑃 w.r.t. 𝐼 is the subset 𝐺𝐼𝑃 of 𝐺𝑃 obtained by deleting from 𝐺𝑃 the rules in which a body literal is false w.r.t. 𝐼 [12]. An interpretation 𝐼 for 𝑃 is an answer set (or stable model) for 𝑃 if 𝐼 is a minimal model (under subset inclusion) of 𝐺𝐼𝑃 (i.e., 𝐼 is a minimal model for 𝐺𝐼𝑃 ). Given a program with weak constraints Ξ  = βŸ¨π‘ƒ, π‘Š ⟩ and an interpretation 𝐼, the semantics of Ξ  extends from the basic case defined above. Thus, let 𝐺𝑃 and πΊπ‘Š be the instantiation of 𝑃 and π‘Š , respectively. Then, let πΊπΌπ‘Š be the set {(w@l,t1 ,. . .,t𝑧 ) | :∼ b1 ,. . .,bπ‘˜ ,not bπ‘˜+1 ,. . .,not bπ‘š . [w@l,t1 ,. . .,t𝑧 ] ∈ πΊπ‘Š π‘Žπ‘›π‘‘ b1 ,. . .,bπ‘š ∈ 𝐼}. Moreover, for an integer 𝑙, 𝑃𝑙𝐼 = (𝑀@𝑙,𝑑1 ,...,𝑑𝑧 )∈𝐺𝐼 𝑀 if there is at least one tuple in πΊπΌπ‘Š whose βˆ‘οΈ€ π‘Š level is equal to 𝑙, 0 otherwise. Given a program with weak constraints Ξ  = βŸ¨π‘ƒ, π‘Š ⟩, an answer set 𝑀 for 𝑃 is said to be dominated by an answer set 𝑀 β€² for 𝑃 , if there exists an integer 𝑙 such β€² β€² that 𝑃𝑙𝑀 < 𝑃𝑙𝑀 and 𝑃𝑙𝑀 β€² = 𝑃𝑙𝑀 β€² for all integers 𝑙′ > 𝑙. An answer set 𝑀 for 𝑃 is said to be optimal or optimum for Ξ  if there is no other answer set 𝑀 β€² that dominates 𝑀 [11]. Syntactic shortcuts. In the following, p(1..n). denotes the set of facts p(1). . . . p(n). Moreover, we use choice rules of the form {X}, where X is a set of atoms. Choice rules of this kind can be viewed as a syntactic shortcut for the rule p | p’. for each p ∈ X, where p’ is a fresh new atom not appearing elsewhere in the program, meaning that the atom 𝑝 can be chosen as true. Choice rules can also have bounds, i.e., {X} = 1, can be seen as a shortcut for the choice rule {X} and the rule :- #count{X} != 1. 190 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 3. CNL2ASP This section deals with the implementation of the grammar for the Controlled Natural Language (CNL) presented in this work. A specification written in this CNL is made of propositions, the structure of which is defined by clauses, linked by connectives, that are used to express concepts, query them for information or express conditions on them. Concepts in a proposition can be subjects, that represent the focus of the definition and only one of them is allowed per proposition, or companion objects, that represent other information associated with the subject of the proposition. The combination of clauses that produces a proposition defines its type, that is used to understand what the proposition is supposed to mean and how that meaning can be translated into ASP rules and facts. In the following subsections, we show the proposition types that compose the CNL and related sub-types, if any, along with a visualization of their translation into the elements that form the resulting ASP programs. 3.1. Grammar A brief description of the grammar is reported in the following, we refer the reader to [14] for the full grammar. In particular, our CNL is made of one or many propositions: 1 start βˆ’β†’ (( negative_strong_constraint_proposition 2 | positive_strong_constraint_proposition 3 | weak_constraint_proposition 4 | definition_proposition 5 | quantified_choice_proposition)CNL_END_OF_LINE)+ where each proposition is terminated by CNL_END_OF_LINE (in our case, a dot). We consider five types of propositions, namely negative strong constraints, positive strong constraints, weak constraints, definition clauses, and quantified choice clauses. Negative strong constraints are as follows: 1 negative_strong_constraint_proposition βˆ’β†’ "it is prohibited that" ( simple_clause ("and also" simple_clause)* (where_clause)? | aggregate_clause comparison_clause (where_clause)? | quantified_constraint (where_clause)?) In particular, they start with the sentence "it is prohibited that" and are always followed by a simple clause, i.e. a sentence of the form subject verb object, by an aggregate clause, i.e. a sentence expressing a form of aggregations (e.g. the number of) or by a quantified constraint, i.e. a sentence used to specify clauses with quantifiers as every or any. After that, additional sentences can be added, which can be of the type (i) where_clause; or (ii) comparison_clause. Sentences of type (i) are used to specify conditions; sentences of type (ii) are used to specify comparison between elements (e.g. X is equal to 1). Positive strong constraints are as follows: 1 positive_strong_constraint_proposition βˆ’β†’ "it is required that" ( aggregate_clause comparison_clause (where_clause)? | when_then_clause ( where_clause)? | quantified_constraint (where_clause)?) 191 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 In particular, they start with the sentence "it is required that" and are followed by an aggregate clause, by a quantified constraint, or by a clause of the type "when"/"then". Weak constraints are as follows: 1 weak_constraint_proposition βˆ’β†’ "it is preferred that" ","? weak_priority_operator ","? "that" (condition_operation | aggregate_clause) weak_optimization_operator (where_clause)? In particular, they start with the sentence "it is preferred that" and are always followed by a priority operator, i.e. a sentence expressing the level of relevance of the constraint with respect to other weak constraints (e.g. "with low priority") and either an aggregate clause or a condition operation, i.e., a sentence expressing operations between variables in the proposition (e.g. the sum of X and Y). The proposition is closed with an optimization operator, i.e., a sentence expressing the nature of the optimization (e.g. "is maximized") and an optional where clause. Definition propositions are as follows: 1 definition_proposition βˆ’β†’ (compounded_clause | constant_definition_clause | enumerative_definition_clause) In particular, a definition proposition can be one of the type (i) compounded_clause; (ii) constant_definition_clause; (iii) enumerative_definition_clause. Sentences of type (i) are used to define elements using lists and ranges; sentences of type (ii) are used to specify constants; sentences of type (iii) are used to define elements one at a time, optionally closing the proposition with a "when" clause, defining a condition in which the element is defined (e.g. X is true when Y is true), and with a where clause. Quantified choice propositions are as follows: 1 quantified_choice_proposition βˆ’β†’ quantifier subject_clause "can" CNL_COPULA? (verb_name | verb_name_with_preposition) ( quantified_exact_quantity_clause | quantified_range_clause)? quantified_object_clause? foreach_clause? In particular, they start with a quantifier, and are always followed by a subject and a verb, optionally connected by a CNL_COPULA (e.g. is, is a, is an, ...) and then, also optionally, a sentence of either type (i) quantified_exact_quantity_clause; or (ii) quantified_range_clause. Sentences of type (i) are used to express the quantity in question in exact terms (e.g. exactly 1); sentences of type (ii) are used to express it using a range (e.g. between 1 and 2). The proposition can be closed with an object clause, i.e. a sentence used to express an object for the proposition, in a subject verb object fashion, or a simple clause and, finally, a foreach clause, i.e. a sentence used to express additional objects for which any possible value can be tried. 3.2. Definition propositions Definition propositions are used to define concepts that can, then, be used by other propositions. These definitions express the signature of the concept indicated by the subject of the proposition, carrying information regarding the concept in the definition that our system can use later on in the specification whenever the same concept is used. 192 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 There are three different types of constructions available for these kinds of propositions, namely constant definitions, compound definitions, and enumerative definitions. Constant definitions. They are used to introduce constants to be used later on in the specification. Example 3.1. MyConstant is equal to 10. (1) As we can see from proposition (1), the constant 10 is introduced with a equal to clause, and it is bound to the subject of the proposition. In the case of constant definitions, there are no translations to ASP available, because they are instead stored and substituted in the resulting program when the subject of the definition is used. Compound definitions. Such definitions are used to introduce a set of related concepts all at once, by making use of either ranges of numbers or lists. Example 3.2. A day goes from 1 to 365 and is made of hours that are made of (2) minutes. A drink is one of alcoholic, nonalcoholic and has color that is equal to (3) respectively blue, yellow. Proposition (2) is an example of a definition of a day using a range, identified by the construct goes from/to. Additionally, it can be noticed the presence of additional made of clauses, indicating what the preceding object, may it be the subject or an object complement, is made of. Proposition (3) tries to do the same thing with a drink using lists with a one of clause, where one can also specify additional attributes for each element of the list in a positional way using a respectively clause, and a list with the same number of elements of the list enumerating all the possible values that the subject of the proposition can have. The corresponding ASP code, in this case, is quite straightforward and is depicted below: 1 day(1..365). 2 drink(1, "alcoholic", "blue"). 3 drink(2, "nonalcoholic", "yellow"). where line 1 corresponds to proposition (2), while lines 2 and 3 are related to proposition (3). One can notice the absence of information regarding made of clauses in proposition (2), but this is because the system only adds the composition hierarchy to the signature and does not encode them into ASP. Also, list elements defined in proposition (3) carry around their position number with them, which turns out to be handy as a basic way to encode precedence relationships when the subject is not a number. Enumerative definitions. Enumerative definitions are used to introduce a property for a single concept or a relationship between a set of concepts. The peculiarity of this kind of propositions lies in the different translations into ASP as the clauses used within them change. 193 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 Example 3.3. John is a waiter. (4) Pub 1 is close to pub 2 and pub X, where X is equal to 3,4. (5) Waiter John works in pub 1. (6) Waiter W is working when waiter W serves a drink. (7) In propositions (4)–(7) the construction to define relationships or properties related to a particular subject is shown. Proposition (5), in particular, illustrates another feature of our CNL, i.e., where clauses, that are used in the example to define the values that the variable X can take. The previous definitions always hold true, whereas proposition (4) is a conditional definition, identified by a when clause, and so it holds true only if the statement introduced by the aforementioned clause is true. Note that John is not considered as a variable here since it contains a lower case letter. Indeed, our assumption is that variables contain only upper case letters, numbers or symbols. Below are the translations in ASP of these examples: 1 waiter(john). 2 close_to(1, 2, 3). close_to(1, 2, 4). 3 work_in(john, 1). 4 working(W) :- serve(W,_). Also here, the translation is quite intuitive. Propositions that express relationships or properties that always hold true become regular ASP facts. In particular, proposition (6) in translated in a similar manner to compound definitions with lists. Proposition (5), instead, represents something that follows by the truth of another statement and, as such, is translated into an ASP rule, with what’s inside the when clause as the body of the rule. 3.3. Quantified propositions Quantified propositions are used to define relationships or properties that can be true for a given set of selected concepts following a choice. Also these propositions define a signature for the concept upon which the choice has to be made. Quantified propositions are always introduced by the every quantifier and, since they express possibilities, always contain a can clause. Example 3.4. Every patron can drink in exactly 1 pub for each day. (8) Every waiter can serve a drink. (9) Every patron can drink with a patron next to patron X. (10) Proposition (8) shows how one can express an exact number of choices that can be made for the concept expressed by the subject, and also how other concepts can be used in tandem with the subject to create a sort of cartesian product of choices, using a for each clause. These last constructions are optional, as shown in proposition (9). Proposition (10), instead, shows how one can constrain the choices that can be made by means of an auxiliary statement that defines the set of concepts from which the set of possible choices can be constructed. Their full translations into ASP is shown below: 194 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 {drink_in(_X1,_X2,_X3): pub(_X2)} = 1 :- patron(_X1), day(_X3). {serve(_X1,_X2): drink(_X2)} :- waiter(_X1). {drink_with(X,_X1): next_to(X,_X1)} :- patron(X). These translations present a couple of novelties. The first one is the introduction of choice rules with bounds, that are the ASP constructs that make it possible to represent propositions of this type. The second one is the introduction of generated variables (starting with symbol _) that are used wherever two atoms have to be bound and no variable to use has been found in the specification given in input. This feature enables the specification writer to avoid cluttering the document with unnecessary variables, as can be seen throughout the propositions, with the only limitation that anaphoras have to be expressed explicitly by providing the correct variable. 3.4. Strong Constraint propositions Strong Constraint propositions are used to define assertions that must be true for a given set of selected concepts. This kind of propositions does not introduce new signatures but, on the contrary, it consumes other signatures that were previously defined, meaning that the concepts used inside these are to be defined before they are used. A strong constraint can represent either a prohibition (It is prohibited...) or a requirement (It is required...), in which case one must specify a when clause, containing a premise, and a then clause, containing a consequence. In addition to simple clauses, that are made of subject, a verb, and related object clauses, one can also specify aggregate clauses, either in active or passive form, that define an aggregation of the set of concepts that satisfy the statement in their body with the operator that was specified (number or total). Example 3.5. It is prohibited that waiter X works with waiter W and also waiter X (11) works with a waiter before W, where W is between Jack and Mary. It is required that the number of occurrences between each 5 days (12) where a waiter works is at least 2. It is required that when a waiter works for 2 consecutive days then (13) the next day does not work. It is prohibited that a waiter does not work in a day and also the (14) previous 2 consecutive days does not work. It is required that when waiter X works in pub P1 then waiter X does (15) not work in pub P2, where P1 is different from P2. It is required that every waiter is payed. (16) Proposition (11)-(14) show practical examples of how lists can be queried for sequence rela- tionships between its elements. It is possible to check whether elements come before or after a specified element, take "windows" of a fixed number of elements or take consecutive elements, also from a specified one. Proposition (11) and proposition (15) show the other two features supported by where clauses, making it possible to express bounds or equality constraints between variables. In addition to that, arithmetic operations are supported by the use of an equal to clause. Lastly, proposition (16) is an example of how to specify a requirement that must hold for all the elements present in a particular set of concepts. The translations of the 195 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 propositions above are the following: 1 :- waiter(_X1, W), waiter(_X2, _X3), work_with(X, W), work_with(X,_X3), _X3 < W, _X1 >= 1, _X1 <= 3, _X2 >= 1, _X2 <= 5. 2 :- waiter(_X1), day(_X2), _X2 <= 352, #count{_X3: work(_X1,_X3), _X3 >= _X2, _X3 <= _X2+4} < 2. 3 :- day(_X1), #count{_X3: work(_X2,_X3), _X3 >= _X1, _X3 <= _X1+1} = 2, not work(_X2,_X1+2), waiter(_X2). 4 :- work(_X1,_X2), day(_X3), #count{_X2: work(_X1,_X2), _X2 >= _X3-2, _X2 <= _X3-1} = 2. 5 :- work_in(_X1, P1), work_in(_X1, P2), P1 != P2. 6 :- not payed(_X1), waiter(_X1). The translations above show that list element operations and checks are reflected on the indices of the elements in the list, or the numbers directly when there is a list. This behaviour is shown in the translation from proposition (11) and proposition (12). Proposition (13) and proposition (14), instead, show how check for consecutive elements in list are translated, as an existence check of the exact number of elements that have to satisfy the statement inside the aggregate, where the actual statement expressed in the proposition is expressed. On the other hand, proposition (15) and proposition (16) translations are not very complex, and show how different clauses referring to the same or different subject are linked in the resulting program, as well as how quantifiers inside constraints are dealt with. 3.5. Weak Constraint propositions Weak constraint propositions are used to define assertions that are preferably true for a given set of selected concepts. Also this type of propositions consume signatures from previously defined concepts. They are always introduced by It is preferred and need the specification of the optimization objective (either minimization or maximization) and the level of priority of the optimization (low, medium or high). Example 3.6. It is preferred, with high priority, that the difference in absolute (17) value between 1000 and and the total of items in a box where a drink is stocked ranging between 10 and 5000 is minimized. It is preferred with low priority that the number of drinks that are (18) served is maximized. Proposition (17) displays a weak constraint that represent a minimization objective. Here, the value to optimize is obtained by means of an arithmetic operation (showing here that also the absolute value operator for those is supported) between a constant number and the result of an aggregate clause, also with the definition of the range of possible values where the result of this arithmetic operation lies (useful for limiting the search space of the solutions). In proposition (18) there is an example of a maximization constraint, showing that arithmetic operations are optional and even just the results of the aggregate clauses can be used as values to be optimized. Translation of the propositions above are shown below: 1 :∼ drink(_X1), #sum{_X2,_X3: stocked(_X1,_X3,_X2)} = _X4, _X4 >= 10, _X4 <= 5000, _X5 = |1000-_X4|. [_X5@1, _X1] 196 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 2 :∼ #count{_X1: served(_X1)} = _X2. [-_X2@3] The translations of proposition (17) and proposition (18) are very linear, starting from the aggregate for the first value, and then, if necessary, putting bounds on it or extracting another result, that is, the one to be optimized. Remarkably, priority levels are transformed into numerical level constants, in descending order from the least severe to the most, and maximization constraints are just translated as flipping the sign of the value to be optimized. Lastly, it is worth mentioning that the difference in the translation of the aggregate from proposition (18) with respect to other translations seen in previous examples is due to the fact that this is a special case of the active aggregate form, where a property is in verb position. 4. Use cases In this section we present some examples to demonstrate how the language can be used to define well-known problems in a natural an easily understandable way. The corresponding translations into ASP are also provided. 4.1. Graph coloring We begin by presenting an encoding of the graph coloring problem using our controlled natural language: 1 A node goes from 1 to 3. 2 A color is one of red, green, blue. 3 Node 1 is connected to node X, where X is one of 2, 3. 4 Node 2 is connected to node X, where X is one of 1, 3. 5 Node 3 is connected to node X, where X is one of 1, 2. 6 Every node can be assigned to exactly 1 color. 7 It is required that when node X is connected to node Y then node X is not assigned to color C and also node Y is not assigned to color C. One can notice the presence of a ranged definition proposition (line 1) and a list definition proposition (line 2), enumerative definition propositions with where clauses (lines 3–5), a quantified clause (line 6) and, lastly, a positive strong constraint (line 7). The resulting ASP encoding is the following: 1 node(1..3). 2 color(1,"red"). color(2,"green"). color(3,"blue"). 3 connected_to(1,2). connected_to(1,3). 4 connected_to(2,1). connected_to(2,3). 5 connected_to(3,1). connected_to(3,2). 6 {assigned_to(_X1,_X2): color(_,_X2)} = 1 :- node(_X1). 7 :- connected_to(X,Y), assigned_to(X,C), assigned_to(Y,C). where each proposition at line 𝑖 is translated as the rule(s) reported at line 𝑖 (with 𝑖 = [1..7]). For the sake of completeness, we report a comparison with the CNL for specifying the graph coloring problem used by PENGASP (Figure 5 of [10]). 1 The node 1 is connected to the nodes 2 and 3. 197 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 2 The node 2 is connected to the nodes 1 and 3. 3 The node 3 is connected to the nodes 1 and 2. 4 Red is a colour. Green is a colour. Blue is a colour. 5 Every node is assigned to exactly one colour. 6 It is not the case that a node X is assigned to a colour and a node Y is assigned to the colour and the node X is connected to the node Y. There are two main differences. The first one is that our CNL must use variables (i.e., X in our example) also to specify the connections, whereas the one of PENGASP does not need it. In our case, sentence at line 1 would create the atom connected_to(1,2,3). Secondly, the last sentence is expressed in a negative form in case of PENGASP , which is similar to the concept of constraint in ASP, whereas our CNL uses a positive sentence which is similar to the concept of clause in propositional logic. 4.2. Hamiltonian path The following statements represent an encoding in our CNL for the Hamiltonian path problem: 1 A node goes from 1 to 5. 2 Node 1 is connected to node X, where X is one of 2, 3. 3 Node 2 is connected to node X, where X is one of 1, 4. 4 Node 3 is connected to node X, where X is one of 1, 4. 5 Node 4 is connected to node X, where X is one of 3, 5. 6 Node 5 is connected to node X, where X is one of 3, 4. 7 Every node X can have a path to a node connected to node X. 8 It is required that the number of nodes where node X has a path to is equal to 1. 9 It is required that the number of nodes that have a path to node X is equal to 1. 10 Node Y is reachable when node X is reachable and also node X has a path to node Y. 11 It is required that every node is reachable. 12 Start is equal to 1. 13 Node start is reachable. Line 1 defines the nodes and lines from 2 to 6 define the connections between nodes. Then, line 7 reports a quantified proposition with an object accompanied by a verb clause, lines 8 and 9 report strong constraint propositions with aggregates, line 10 reports a conditional definition clause, line 11 reports a constraint clause with the presence of a quantifier, and line 12 define the constant start, which is subsequently used in line 13. The ASP encoding corresponding to the CNL statements is the following: 1 node(1..5). 2 connected_to(1,2). connected_to(1,3). 3 connected_to(2,1). connected_to(2,4). 4 connected_to(3,1). connected_to(3,4). 5 connected_to(4,3). connected_to(4,5). 6 connected_to(5,3). connected_to(5,4). 7 {path_to(X,_X1): connected_to(X,_X1)} :- node(X). 8 :- node(X), #count{_X2: path_to(X,_X2)} != 1. 198 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 9 :- node(X), #count{_X3: path_to(_X3,X)} != 1. 10 reachable(Y) :- reachable(X), path_to(X,Y). 11 :- not reachable(_X4), node(_X4). 12 reachable(1). where a CNL statement at line 𝑖 is represented by the rule(s) at line 𝑖 with (𝑖 = [1..11]), whereas CNL statements reported in lines 12 and 13 are encoded by the rule at line 12. 4.3. Maximal clique The following statements represent the encoding in our CNL of the maximum clique problem: 1 A node goes from 1 to 5. 2 Node 1 is connected to node X, where X is one of 2, 3. 3 Node 2 is connected to node X, where X is one of 1, 3, 4, 5. 4 Node 3 is connected to node X, where X is one of 1, 2, 4, 5. 5 Node 4 is connected to node X, where X is one of 2, 3, 5. 6 Node 5 is connected to node X, where X is one of 2, 3, 4. 7 Every node can be chosen. 8 It is required that when node X is not connected to node Y then node X is not chosen and also node Y is not chosen, where X is different from Y. 9 It is preferred with high priority that the number of nodes that are chosen is maximized. where statements from line 1 to line 6 define the input graph. Then, line 7 reports a quantified proposition with no object, line 8 contains a strong constraint proposition with a comparison on the variables used inside it, and line 9 reports a weak constraint expressing a maximiza- tion preference on the highest priority level. The resulting ASP encoding is reported in the following: 1 node(1..5). 2 connected_to(1,2). connected_to(1,3). 3 connected_to(2,1). connected_to(2,3). connected_to(2,4). connected_to(2,5). 4 connected_to(3,1). connected_to(3,2). connected_to(3,4). connected_to(3,5). 5 connected_to(4,2). connected_to(4,3). connected_to(4,5). 6 connected_to(5,2). connected_to(5,3). connected_to(5,4). 7 {chosen(_X1)} :- node(_X1). 8 :- not connected_to(X,Y), chosen(X), chosen(Y), X != Y. 9 :∼ #count{_X1: chosen(_X1)} = _X2. [-_X2@1] where each CNL proposition at line 𝑖 is translated as the rule(s) reported at line 𝑖 (with 𝑖 = [1..9]). 5. Conclusion In this paper we proposed a system for automatically translating English sentences expressed using a controlled language into ASP rules. Moreover, we provided several examples of com- binatorial problems that can be specified in a clear and easy way, even by a non-ASP expert. using our CNL and their translation as ASP rules. Concerning future work, our CNL might be extended to cover additional constructs and language extensions, e.g. temporal operators [15] 199 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 or Constraint ASP (CASP) [16, 17], and the tool can be extended to implement a bidirectional conversion, where ASP rules are translated into CNL statements. Finally, we recall that the tool presented in this paper is open source and available at https://github.com/dodaro/cnl2asp. References [1] G. Brewka, T. Eiter, M. Truszczynski, Answer set programming at a glance, Commun. ACM 54 (2011) 92–103. doi:10.1145/2043174.2043195. [2] E. Erdem, M. Gelfond, N. Leone, Applications of answer set programming, AI Mag. 37 (2016) 53–68. doi:10.1609/aimag.v37i3.2678. [3] M. Gebser, R. Kaminski, B. Kaufmann, M. Ostrowski, T. Schaub, P. Wanko, Theory solving made easy with clingo 5, in: Technical Communications of ICLP, volume 52 of OASIcs, Schloss Dagstuhl - Leibniz-Zentrum fΓΌr Informatik, 2016, pp. 2:1–2:15. doi:10.4230/ OASIcs.ICLP.2016.2. [4] M. Alviano, F. Calimeri, C. Dodaro, D. FuscΓ , N. Leone, S. Perri, F. Ricca, P. Veltri, J. Zan- gari, The ASP system DLV2, in: Proceedings of LPNMR, volume 10377, Springer, 2017, pp. 215–221. URL: https://doi.org/10.1007/978-3-319-61660-5_19. doi:10.1007/ 978-3-319-61660-5\_19. [5] T. Kuhn, A survey and classification of controlled natural languages, Comput. Linguistics 40 (2014) 121–170. URL: https://doi.org/10.1162/COLI_a_00168. doi:10.1162/COLI\_a\ _00168. [6] I. S. Bajwa, M. G. Lee, B. Bordbar, SBVR business rules generation from natural language specification, in: AI for Business Agility, AAAI, 2011. URL: http://www.aaai.org/ocs/index. php/SSS/SSS11/paper/view/2378. [7] The Business Rules Group, Defining business rules ∼ what are they really?, 2000. URL: http://www.businessrulesgroup.org/first_paper/BRG-whatisBR_3ed.pdf. [8] C. Baral, J. Dzifcak, Solving puzzles described in english by automated translation to answer set programming and learning how to do that translation, in: Proceedings of KR, AAAI Press, 2012. [9] E. Erdem, R. Yeniterzi, Transforming controlled natural language biomedical queries into answer set programs, in: Proceedings of the BioNLP Workshop, Association for Computational Linguistics, 2009, pp. 117–124. [10] R. Schwitter, Specifying and verbalising answer set programs in controlled nat- ural language, Theory Pract. Log. Program. 18 (2018) 691–705. doi:10.1017/ S1471068418000327. [11] F. Calimeri, W. Faber, M. Gebser, G. Ianni, R. Kaminski, T. Krennwallner, N. Leone, M. Maratea, F. Ricca, T. Schaub, Asp-core-2 input language format, Theory Pract. Log. Program. 20 (2020) 294–309. doi:10.1017/S1471068419000450. [12] W. Faber, G. Pfeifer, N. Leone, Semantics and complexity of recursive aggregates in answer set programming, Artif. Intell. 175 (2011) 278–298. doi:10.1016/j.artint.2010.04. 002. [13] F. Buccafurri, N. Leone, P. Rullo, Enhancing disjunctive datalog by constraints, IEEE Trans. Knowl. Data Eng. 12 (2000) 845–860. doi:10.1109/69.877512. 200 Carmine Dodaro et al. CEUR Workshop Proceedings 188–201 [14] C. Dodaro, M. Maratea, F. Riccio, Grammar of the cnl, 2022. URL: https://github.com/ dodaro/cnl2asp/blob/main/src/cnl/grammar/cnl_grammar.lark. [15] P. Cabalar, M. DiΓ©guez, T. Schaub, A. Schuhmann, Towards metric temporal answer set programming, Theory Pract. Log. Program. 20 (2020) 783–798. URL: https://doi.org/10. 1017/S1471068420000307. doi:10.1017/S1471068420000307. [16] M. Balduccini, Industrial-size scheduling with ASP+CP, in: Proceedings of LPNMR, volume 6645 of LNCS, Springer, 2011, pp. 284–296. URL: https://doi.org/10.1007/978-3-642-20895-9_ 33. doi:10.1007/978-3-642-20895-9\_33. [17] M. Banbara, B. Kaufmann, M. Ostrowski, T. Schaub, Clingcon: The next generation, Theory Pract. Log. Program. 17 (2017) 408–461. URL: https://doi.org/10.1017/S1471068417000138. doi:10.1017/S1471068417000138. 201