Answer Set Programming and Large Language Models
                         interaction with YAML: Second Report
                         Mario Alviano, Lorenzo Grillo, Fabrizio Lo Scudo and Luis Angel Rodriguez Reiners*
                         DeMaCS, University of Calabria, 87036 Rende (CS), Italy


                                     Abstract
                                     Answer Set Programming (ASP) and Large Language Models (LLMs) have emerged as powerful tools in Artificial
                                     Intelligence, each offering unique capabilities in knowledge representation and natural language understanding,
                                     respectively. In this paper, we combine the strengths of the two paradigms to couple the reasoning capabilities of
                                     ASP with the attractive natural language processing capability of LLMs. We introduce a YAML-based format
                                     for specifying prompts, allowing users to configure the behavior of the system and to encode domain-specific
                                     background knowledge. Input prompts are processed by LLMs to generate relational facts, which are then
                                     processed by ASP rules for knowledge reasoning, and finally the ASP output is mapped back to natural language
                                     by LLMs, so as to provide a captivating user experience.

                                     Keywords
                                     Answer Set Programming, Large Language Models, Knowledge Representation, Natural Language Generation


                         1. Introduction
                         Large Language Models (LLMs) and Answer Set Programming (ASP) represent two distinct but com-
                         plementary paradigms in Artificial Intelligence (AI). LLMs, such as GPT [1], PaLM [2], and LLaMa [3],
                         have transformed natural language processing (NLP) by achieving unprecedented levels of fluency and
                         understanding in textual data. In contrast, ASP [4, 5], a declarative programming paradigm rooted in
                         logic programming under answer set semantics [6], excels in knowledge representation and logical
                         reasoning, making it fundamental for AI systems that require robust inference capabilities. Individually,
                         LLMs and ASP offer unique advantages within their respective domains. LLMs excel at various NLP
                         tasks [7, 8], such as language generation, summarization, and sentiment analysis, utilizing deep learning
                         and extensive pre-trained language models. In contrast, ASP equips AI systems with robust reasoning
                         capabilities, enabling them to process complex knowledge bases, draw logical conclusions, and tackle
                         intricate combinatorial problems. This makes ASP particularly effective in decision-making scenarios,
                         including planning and scheduling [9, 10], as well as diagnosis and configuration [11, 12]. Recognizing
                         the complementary strengths of LLMs’ linguistic abilities and ASP’s reasoning capabilities, this paper
                         proposes an approach that leverages the synergies between these two paradigms, inspired by recent
                         works in the literature [13, 14]. Our objective is to develop a unified system that seamlessly integrates
                         natural language understanding with logical inference, allowing AI applications to adeptly handle the
                         complex interplay between textual data and logical structures.
                            In this paper, we propose a comprehensive framework that combines LLMs and ASP, exploiting the
                         strengths of both paradigms to mitigate their respective weaknesses. We outline a method for encoding
                         specific domain knowledge into input prompts using a YAML-based format, which allows LLMs to
                         produce relational facts that are utilized by ASP for reasoning. The reasoned outcomes of ASP are then
                         translated back into natural language by LLMs, creating an engaging user experience and improving the
                         clarity of the results. An overview of the main pipeline addressed by our system is shown in Figure 1.

                         Workshop on Symbolic and Neuro-Symbolic Architectures for Intelligent Robotics Technology (SYNERGY) co-located with the
                         21st International Conference on Principles of Knowledge Representation and Reasoning (KR2024), November 2–8, 2024, Hanoi,
                         Vietnam.
                         * Corresponding author.
                         Email: mario.alviano@unical.it (M. Alviano); luis.reiners@unical.it (L. A. Rodriguez Reiners)
                         URL: https://alviano.net/ (M. Alviano)
                         ORCID: 0000-0002-2052-2063 (M. Alviano); 0009-0000-1808-9910 (L. A. Rodriguez Reiners)
                                  © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


Figure 1: Graphical representation of the LLMASP pipeline. The system receives two YAML files—one describing
system behavior and the other related to the specific application being developed—together with a relational
database and a user query in natural language. In this setup, the behavior YAML file offers initial prompts for
the LLM, which are then filled in with details from the application YAML file and the user’s input. The LLM uses
these finalized prompts to extract information from the user input, converting it into factual representations.
These facts are integrated with an ASP program (knowledge base) and processed by an ASP solver to produce an
answer set. Finally, the initial prompts are once again utilized to translate the returned answer set back into
natural language, delivering a coherent (and easier to read) response to the user.


The system takes as input two YAML files, one defining the behavior of the system and one specific to
the application under development, a database (relational facts), and a request from the user expressed
in natural language. The behavior file contains partial prompts for the LLM, which are completed with
the data stored in the application file and the user input. The complete prompts ask the LLM to extract
data from the user input and represent them in the form of facts. Facts are then combined with a knowledge
base (an ASP program) and processed by an ASP solver to obtain an answer set. The answer set is then
combined with the prompts obtained from the two YAML files to produce an answer in natural language
for the user.
   As a possible application of the technology we want to develop, imagine a bustling online marketplace
that introduces a new, innovative way for users to interact with its platform: a chatbot. Traditionally,
users navigate the website using a mouse and keyboard to add or remove items from their shopping
basket. However, this marketplace now offers an alternative that caters to the growing number of
mobile phone users who prefer a more intuitive interaction. Instead of manually clicking through
menus, users can simply converse with the chatbot using natural language to achieve the same results.
For instance, a user might say, “Add two red t-shirts to my cart” or “Remove the coffee maker from
my basket,” and the chatbot seamlessly executes these commands. This voice-to-text interaction is
particularly convenient for mobile users who find typing cumbersome or impractical on smaller screens.
By leveraging the chatbot, the marketplace not only enhances the shopping experience but also makes
it more accessible and user-friendly for everyone.
   In summary, our prototype system facilitates a smooth interaction between an LLM and an ASP
solver. When an input text is received, the system interacts with the LLM using a set of predefined
prompts whose overall structure is enhanced with additional sentences, as described in the YAML
format. The LLM-generated responses are then processed to extract factual data, which is fed into
the ASP solver. The ASP solver performs logical reasoning on the combination of input data and the
specified knowledge base, producing an answer set that encapsulates reasoned conclusions and insights.
To translate the logical output into comprehensible natural language, the system re-engages the LLM,
using predefined prompts and enriched specifications to convert the answer set into coherent sentences.
Lastly, the sentences are summarized by the LLM, offering users a concise and insightful overview of
the derived conclusions.


2. Background
2.1. Large Language Models
Large Language Models (LLMs) are sophisticated artificial intelligence systems designed to understand
and generate human-like text. These models are typically based on deep learning architectures, such as
Transformers, and are trained on vast amounts of text data to learn complex patterns and structures of
language. In this article, LLMs are used as black box operators on text (functions that take text as input
and produce text as output). At each interaction with an LLM, the generated text is influenced by all
previously processed text, and randomness is involved in the process. The input text is called the prompt,
and the output text is called the generated text or response.
Example 1. Let us consider the following prompt:
      Encode as Datalog facts the following sentences: I would like some cooking ideas with
      apples as dessert, and something with meat as a main plate.
A response produced by Meta Llama 3 is reported below.
want(dessert, apples). want(main_plate, meat).
dish_type(apples, dessert). dish_type(meat, main_plate).

It is a very good starting point, but the LLM must be instructed on a specific format to use in encoding
facts. We aim at gaining more control on the output produced by the LLM.                              ■

2.2. Answer Set Programming
All sets and sequences considered in this paper are finite. Let P, C, V be fixed nonempty sets of
predicate names, constants and variables. Predicates are associated with an arity, a non-negative integer.
A term is any element in C ∪ V. An atom is of the form 𝑝(𝑡), where 𝑝 ∈ P, and 𝑡 is a possibly empty
sequence of terms. A literal is an atom possibly preceded by the default negation symbol not; they are
referred to as positive and negative literals. An aggregate is of the form
                                        #sum{𝑡𝑎 , 𝑡′ : 𝑝(𝑡)} ⊙ 𝑡𝑔                                      (1)
where ⊙ ∈ {<, ≤, ≥, >, =, ≠} is a binary comparison operator, 𝑝 ∈ P, 𝑡 and 𝑡′ are possibly empty
sequences of terms, and 𝑡𝑎 and 𝑡𝑔 are terms. Let #count{𝑡′ : 𝑝(𝑡)} ⊙ 𝑡𝑔 be syntactic sugar for
#sum{1, 𝑡′ : 𝑝(𝑡)} ⊙ 𝑡𝑔 . A choice is of the form
                                          𝑡1 ≤ {atoms} ≤ 𝑡2                                            (2)
where atoms is a possibly empty sequence of atoms, and 𝑡1 , 𝑡2 are terms. Let ⊥ be syntactic sugar for
1 ≤ {} ≤ 1. A rule is of the form
                                        head :– body.                                               (3)
where head is an atom or a choice, and body is a possibly empty sequence of literals and aggregates.
(Symbol :– is omitted if body is empty. The head is usually omitted if it is ⊥, and the rule is called a
constraint.) For a rule 𝑟, let 𝐻(𝑟) denote the atom or choice in the head of 𝑟; let 𝐵^Σ(𝑟), 𝐵^+(𝑟) and
𝐵^−(𝑟) denote the sets of aggregates, positive and negative literals in the body of 𝑟; let 𝐵(𝑟) denote the
set 𝐵^Σ(𝑟) ∪ 𝐵^+(𝑟) ∪ 𝐵^−(𝑟).
Example 2. Let us consider the following rules:
     index(1). index(2). index(3). succ(1,2). succ(2,3). empty(2,2).
     grid(X,Y) :- index(X), index(Y).
     0 <= {assign(X,Y)} <= 1 :- grid(X,Y), not empty(X,Y).
     :- #count{A,B : assign(A,B)} != 4.
     :- assign(X,Y), succ(X,X'), assign(X',Y ).
     :- assign(X,Y), succ(Y,Y'), assign(X ,Y').

The first line above comprises rules with atomic heads and empty bodies (also called facts). After
that there is a rule with atomic head and nonempty body (also called a definition), followed by a
choice rule, and three constraints. If 𝑟 is the choice rule, then 𝐻(𝑟) is the choice 0 <= {assign(X, Y)} <= 1,
𝐵^+(𝑟) = {grid(X, Y)}, 𝐵^−(𝑟) = {not empty(X, Y)}, and 𝐵^Σ(𝑟) = ∅.                             ■
  A variable 𝑋 occurring in 𝐵^+(𝑟) is a global variable. Other variables occurring among the terms 𝑡 of
some aggregate in 𝐵^Σ(𝑟) of the form (1) are local variables. Any other variable occurring in 𝑟 is
an unsafe variable. A safe rule is a rule with no unsafe variables. A program Π is a set of safe rules. A
substitution 𝜎 is a partial function from variables to constants; the application of 𝜎 to an expression
𝐸 is denoted by 𝐸𝜎. Let instantiate(Π) be the program obtained from rules of Π by substituting
global variables with constants in C, in all possible ways; note that local variables are still present in
instantiate(Π). The Herbrand base of Π, denoted base(Π), is the set of ground atoms (i.e., atoms with
no variables) occurring in instantiate(Π).
Example 3. In the rules of Example 2, variables 𝐴 and 𝐵 are local, and all other variables are global.
Let Π be the program comprising all rules in Example 2 (which are safe). If C = {1, 2, 3}, then
instantiate(Π) comprises, among others, the following rules:
     grid(1,1) :- index(1), index(1).
     0 <= {assign(1,1)} <= 1 :- grid(1,1), not empty(1,1).
     :- #count{A,B : assign(A,B)} != 4.

Note that the local variables 𝐴, 𝐵 are still present in the last rule above.                                  ■
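
The substitution of global variables can be sketched in a few lines of Python. The snippet below grounds the definition of grid/2 from Example 2 over C = {1, 2, 3}. The function name `instantiate_rule` and the string-based rule representation are assumptions made only for this illustration; actual ASP grounders work on parsed rules and also simplify the result.

```python
from itertools import product

def instantiate_rule(rule: str, global_vars: list[str], constants: list[int]) -> list[str]:
    """Replace every global variable with a constant, in all possible ways.

    Assumption: variables are single uppercase letters that occur nowhere
    else in the rule text, so plain textual replacement is safe here.
    """
    ground_rules = []
    for values in product(constants, repeat=len(global_vars)):
        ground = rule
        for var, value in zip(global_vars, values):
            ground = ground.replace(var, str(value))
        ground_rules.append(ground)
    return ground_rules

rules = instantiate_rule("grid(X,Y) :- index(X), index(Y).", ["X", "Y"], [1, 2, 3])
print(len(rules))   # one ground rule per pair of constants
print(rules[0])
```

With three constants and two global variables, the sketch yields nine ground rules, the first of which is the instance of grid/2 shown in Example 3.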
   An interpretation is a set of ground atoms. For an interpretation 𝐼, relation 𝐼 |= · is defined as follows:
for a ground atom 𝑝(𝑐), 𝐼 |= 𝑝(𝑐) if 𝑝(𝑐) ∈ 𝐼, and 𝐼 |= not 𝑝(𝑐) if 𝑝(𝑐) ∉ 𝐼; for an aggregate 𝛼 of
the form (1), the aggregate set of 𝛼 w.r.t. 𝐼, denoted aggset(𝛼, 𝐼), is {⟨𝑡𝑎, 𝑡′⟩𝜎 | 𝑝(𝑡)𝜎 ∈ 𝐼, for some
substitution 𝜎}, and 𝐼 |= 𝛼 if (∑_{⟨𝑐𝑎, 𝑐′⟩ ∈ aggset(𝛼, 𝐼)} 𝑐𝑎) ⊙ 𝑡𝑔 is a true expression over integers; for a choice
𝛼 of the form (2), 𝐼 |= 𝛼 if 𝑡1 ≤ |𝐼 ∩ atoms| ≤ 𝑡2 is a true expression over integers; for a rule 𝑟 with
no global variables, 𝐼 |= 𝐵(𝑟) if 𝐼 |= 𝛼 for all 𝛼 ∈ 𝐵(𝑟), and 𝐼 |= 𝑟 if 𝐼 |= 𝐻(𝑟) whenever 𝐼 |= 𝐵(𝑟);
for a program Π, 𝐼 |= Π if 𝐼 |= 𝑟 for all 𝑟 ∈ instantiate(Π).
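
To make the evaluation of aggregates concrete, the following Python sketch computes an aggregate set and checks the comparison for a #sum aggregate (and hence for #count, which sums the constant 1). The representation of atoms as (name, arguments) pairs and the helper names `aggset` and `satisfies_sum` are assumptions of this illustration; the sketch also assumes that all terms of 𝑝(𝑡) are variables, so that every atom of the predicate in 𝐼 matches.

```python
import operator

COMPARISONS = {"<": operator.lt, "<=": operator.le, ">=": operator.ge,
               ">": operator.gt, "=": operator.eq, "!=": operator.ne}

def aggset(interpretation, predicate, make_tuple):
    """Collect the tuples <t_a, t'> obtained from the atoms of `predicate` in I.

    The result is a set, so identical tuples are counted only once.
    """
    return {make_tuple(*args) for (name, args) in interpretation if name == predicate}

def satisfies_sum(interpretation, predicate, make_tuple, op, guard):
    """Check I |= #sum{t_a, t' : p(t)} op guard; the first component is summed."""
    total = sum(t[0] for t in aggset(interpretation, predicate, make_tuple))
    return COMPARISONS[op](total, guard)

# An interpretation with four instances of assign/2.
I = {("assign", (1, 1)), ("assign", (1, 3)), ("assign", (3, 1)), ("assign", (3, 3))}

# #count{A,B : assign(A,B)} is #sum{1,A,B : assign(A,B)}.
count_tuple = lambda a, b: (1, a, b)
print(satisfies_sum(I, "assign", count_tuple, "=", 4))
```

Because the aggregate set is a set of tuples, the terms 𝑡′ matter: dropping them, as in #sum{1 : assign(A,B)}, collapses all tuples into ⟨1⟩ and the sum becomes 1.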
   For a rule 𝑟 of the form (3) and an interpretation 𝐼, let expand(𝑟, 𝐼) be the set {𝑝(𝑐) :– body. |
𝑝(𝑐) ∈ 𝐼 occurs in 𝐻(𝑟)}. The reduct of Π w.r.t. 𝐼 is the program comprising the expanded rules of
instantiate(Π) whose body is true w.r.t. 𝐼, that is,
                    reduct(Π, 𝐼) := ⋃_{𝑟 ∈ instantiate(Π), 𝐼 |= 𝐵(𝑟)} expand(𝑟, 𝐼).
An answer set of Π is an interpretation 𝐼 such that 𝐼 |= Π and no 𝐽 ⊂ 𝐼 satisfies 𝐽 |= reduct(Π, 𝐼).
Example 4 (Continuing Example 3). Program Π has two answer sets, which comprise the following
instances of predicate assign/2:
   1. assign(1,2), assign(2,1), assign(2,3), assign(3,2);
   2. assign(1,1), assign(1,3), assign(3,1), assign(3,3).
Figure 2 shows the solutions encoded by the two answer sets.                                              ■

Figure 2: The solutions encoded by the two answer sets of program Π in Example 2.
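
The two solutions of Example 4 can be cross-checked without an ASP solver. The Python sketch below enumerates the sets of four non-empty grid cells in which no two cells are horizontal or vertical neighbours, which is the condition imposed by the two constraints on succ/2. It is a brute-force check written for this illustration, not the procedure used by an ASP system; all names in it are hypothetical.

```python
from itertools import combinations

INDEXES = [1, 2, 3]
EMPTY = {(2, 2)}
CELLS = [(x, y) for x in INDEXES for y in INDEXES if (x, y) not in EMPTY]

def adjacent(a, b):
    """True if the two cells are horizontal or vertical neighbours."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

def placements(size):
    """All sets of `size` assignable cells with no two adjacent cells."""
    return [set(cells) for cells in combinations(CELLS, size)
            if not any(adjacent(a, b) for a, b in combinations(cells, 2))]

solutions = placements(4)
for solution in solutions:
    print(sorted(solution))
```

The enumeration returns exactly the two sets of cells listed in Example 4.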

  The language of ASP supports several other constructs, among them binary built-in relations (i.e., <,
<=, >=, >, ==, !=) which are interpreted naturally.


2.3. YAML
YAML (YAML Ain’t Markup Language; https://yaml.org/spec/1.2.2/) is a human-readable data serialization
format commonly used in software development for configuration files, data storage, and data
interchange between different programming languages. YAML uses indentation to denote nesting and
relies on simple syntax rules, such as key-value pairs and lists, to represent structured data. YAML is
often preferred for its simplicity, readability, and flexibility compared to other serialization formats
like JSON and XML. In this article, we focus on the following restricted fragment: A scalar is any
number or string (possibly quoted). A block sequence is a sequence of entries, each one starting with a
dash followed by a space. A mapping is a sequence of key-value pairs, each pair using a colon and space
as separator, where keys and values are scalars. A scalar can be written in block notation using the
prefix | (if not a key). Lines starting with # are comments.

Example 5. Below is a YAML document:
     name: LLMASP
     papers:
     - CILC 2024
     - SYNERGY 2024
     description: |
       LLMASP combines ASP and LLMs...
       LLMs are used to extract data...

It encodes a mapping with keys name, papers and description. Key name is associated with the scalar
LLMASP. Key papers is associated with the list [CILC 2024, SYNERGY 2024]. Key description is
associated with a scalar in block notation.                                                      ■
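
As an illustration of how small the restricted fragment is, the Python sketch below parses the document of Example 5 into a dictionary. The function `parse_fragment` is a hypothetical helper written for this example: it handles only a top-level mapping whose values are scalars, block sequences, or block scalars, and it discards the indentation of block scalar lines. A real application would rely on a complete YAML library instead.

```python
DOCUMENT = """\
name: LLMASP
papers:
- CILC 2024
- SYNERGY 2024
description: |
  LLMASP combines ASP and LLMs...
  LLMs are used to extract data...
"""

def parse_fragment(text: str) -> dict:
    """Parse a top-level mapping of the restricted YAML fragment (simplified)."""
    result = {}
    key = None  # key whose value is currently being collected
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are skipped
        if line.startswith("- "):
            result[key].append(stripped[2:])      # entry of a block sequence
        elif line.startswith(" "):
            result[key] += stripped + "\n"        # line of a block scalar
        else:
            key, _, value = line.partition(":")   # "key: value" pair
            value = value.strip()
            if value == "|":
                result[key] = ""                  # block scalar follows
            elif value == "":
                result[key] = []                  # block sequence follows
            else:
                result[key] = value
    return result

parsed = parse_fragment(DOCUMENT)
print(parsed["papers"])
```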


3. LLMASP Configuration
The interaction between LLMs and ASP is achieved by means of two YAML specification files defined
in this section. The first YAML file specifies global behavior settings for LLMASP, such as tone, style
and general instructions for the LLM, while the second YAML file contains domain-specific guidelines,
such as a description of the context and mappings between each ASP fact and its corresponding natural
language translation.
Behavior File. The behavior file comprises two keys, namely preprocessing and postprocessing.
The preprocessing section contains the following properties:
    • init, whose value is used to provide general instructions to the LLM;
    • context, whose value must include the string §context§, to be combined with contextual
      information regarding an application of interest;
    • mapping, whose value must include the strings §input§, §instructions§ and §atom§, to be
      combined with the instructions on how to extract specific atoms from the user input.
The postprocessing section is similar, but the mapping must include the strings §facts§,
§instructions§ and §atom§, to be combined with the instructions on how to map specific atoms
(from the facts in the answer set) to natural language. Additionally, the postprocessing section specifies
how to summarize the §responses§ collected by the postprocessing mapping operation.
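
The placeholder requirements above lend themselves to a simple sanity check. The sketch below assumes that the behavior file has already been loaded into a Python dictionary and reports the placeholders that are missing. The function `missing_placeholders` and the table `REQUIRED` are hypothetical names introduced for this illustration; the key summarize is taken from Example 6, and LLMASP itself may perform different checks or none.

```python
REQUIRED = {
    ("preprocessing", "context"): ["§context§"],
    ("preprocessing", "mapping"): ["§input§", "§instructions§", "§atom§"],
    ("postprocessing", "context"): ["§context§"],
    ("postprocessing", "mapping"): ["§facts§", "§instructions§", "§atom§"],
    ("postprocessing", "summarize"): ["§responses§"],
}

def missing_placeholders(behavior: dict) -> list[str]:
    """List every required placeholder that does not occur in the behavior file."""
    problems = []
    for (section, prop), placeholders in REQUIRED.items():
        value = behavior.get(section, {}).get(prop, "")
        for placeholder in placeholders:
            if placeholder not in value:
                problems.append(f"{section}.{prop} lacks {placeholder}")
    return problems

behavior = {
    "preprocessing": {
        "init": "You are a Natural Language to Datalog translator.",
        "context": "Context: §context§",
        "mapping": "[USER_INPUT]§input§[/USER_INPUT] §instructions§",  # §atom§ forgotten
    },
    "postprocessing": {
        "init": "You are now a Datalog to Natural Language translator.",
        "context": "Context: §context§",
        "mapping": "[FACTS]§facts§[/FACTS] §atom§: §instructions§",
        "summarize": "Summarize the following responses: §responses§",
    },
}
print(missing_placeholders(behavior))
```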
Example 6. Below is the behavior file encoding the hard-coded system presented at the 39th Italian
Conference on Computational Logic [15].
preprocessing:
  init: |
    You are a Natural Language to Datalog translator. To translate your
    input to Datalog, you will be asked a sequence of questions. The
    answers are inside the user input provided with
    [USER_INPUT]input[/USER_INPUT] and the format is provided with
    [ANSWER_FORMAT]predicate(terms).[/ANSWER_FORMAT]. Predicate is a
    lowercase string (possibly including underscores). Terms is a
    comma-separated list of either double quoted strings or integers.
    Be sure to control the number of terms in each answer!
    An answer MUST NOT be answered if it is not present in the user input.
    Remember these instructions and don't say anything!
  context: |
    Here is some context that you MUST analyze and remember.
    §context§
    Remember this context and don't say anything!
  mapping: |
    [USER_INPUT]§input§[/USER_INPUT]
    §instructions§
    [ANSWER_FORMAT]§atom§.[/ANSWER_FORMAT]

postprocessing:
  init: |
    You are now a Datalog to Natural Language translator.
    You will be given relational facts and mapping instructions.
    Relational facts are given in the form [FACTS]atoms[/FACTS].
    Remember these instructions and don't say anything!
  context: |
    Here is some context that you MUST analyze and remember.
    §context§
    Remember this context and don't say anything!
  mapping: |
    [FACTS]§facts§[/FACTS]
    Each fact matching §atom§ must be interpreted as follows:
    §instructions§
  summarize: |
    Summarize the following responses:
    §responses§


Application File. The second YAML file, the application file, contains three sections, namely
preprocessing, knowledge base, and postprocessing. The values associated with preprocessing
and postprocessing are mappings where keys are either atoms or the special value _ (used for providing
a context), and values are scalars. The value associated with knowledge base is an ASP program.

Example 7. Below is an example about an assistant for a marketplace aimed at helping customers with
fulfilling orders to match a given recipe.
preprocessing:
- _: The marketplace offers food products. Products and product preferences will
    be talked about.
- product_request("product"): List all the products mentioned or requested. If a
    product is named must be listed. Ignore plural, always write the product name
    in singular.
- product_request("product", quantity): List all the products mentioned or
    requested if and only if they have a quantity associated. Ignore plural,
    always write the product name in singular.

knowledge base: |
  #script(python)
  def min(a, b): return a if a < b else b
  #end.

  % guess selection of products
  {select(P,W,Q',T) : Q' = 1..@min(Q,R), T=Q'*PP+C} <= 1 :-
    product_request(P,R), product_price(P,W,PP), warehouse(W),
    warehouse_shipping_cost(W,C), product_in_warehouse(P,W,Q).
  :- product_request(P,R), #sum{Q,W : select(P,W,Q,_)} != R.

  % minimize total cost considering price and shipping
  :~ select(P, W, Q, T). [T@1]

  % minimize shipping cost
  :~ warehouse_shipping_cost(W,C), warehouse_free_shipping(W,T),
    select(_,W,Q,_), Q > 0,
    #sum{Q' * Price,P : select(P,W,Q',_), product_price(P,W,Price)} < T.
    [C@3, W]

  %guess selection of products for a recipe
  {select_for_recipe(R,P,W,Q',PP') : Q' = 1..@min(Q,A), PP'=PP*Q'} <= 1 :-
    product_request(P), recipe(R), recipe_ingredient(R,P,A),
    product_in_warehouse(P,W,Q), product_price(P, W, PP), warehouse(W).
  :- product_request(P), #sum{Q,R,W : select_for_recipe(R,P,W,Q,_);
                               -A,R : recipe_ingredient(R, P, A) } != 0.

  {matching_recipe(R,P)}=1 :- product_request(P), recipe_ingredient(R,P,_).

  #show select/4.
  #show select_for_recipe/5.
  #show matching_recipe/2.

postprocessing:
- _: You are an assistant in an online marketplace, which is talking directly to a
    customer. Your priority is to make sales, so the main goal of all responses
    must be to make a sale. Your answers must be customer-oriented. Do not mention
    any product that is not explicitly provided to you before. Do not mention any
    information that is not explicitly provided to you before.
- select_for_recipe("recipe", "product", "warehouse", "quantity", "total"): Say to
    buy "quantity" of "product" for "total" including shipping costs from
    "warehouse" if the customer desire to make "recipe". Do not forget the
    quantity.
- select("product", "warehouse", "quantity", "total"): Suggest to select
    "quantity" of the "product" for a cost of "total" from the "warehouse". This
    price include the shipping costs.
- matching_recipe("recipe", "product"): Suggest that "recipe" can be done with
    "product".

The preprocessing aims at extracting data or a query about products. The data is combined with the
knowledge base and a database containing all information regarding product costs, for example
  product("apple").
  warehouse_shipping_cost("Lattanzi Warehouse", 7).
  warehouse_shipping_cost("Verza Warehouse", 5).
  warehouse_shipping_fee("Lattanzi Warehouse", 5).
  warehouse_shipping_fee("Verza Warehouse", 3).
  product_in_warehouse("apple", "Lattanzi Warehouse", 20).
  product_in_warehouse("apple", "Verza Warehouse", 3).
  product_price("apple", "Lattanzi Warehouse", 2).
  product_price("apple", "Verza Warehouse", 4).
  recipe_ingredient("Apple Pie", "apple", 4).

The combined knowledge is used to identify the best buying offers and the products matching a recipe.
Finally, the postprocessing prompts aim at producing a paragraph reporting the results found, that is,
the products that can be added to a shopping cart with their costs.                                  ■
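
The totals appearing in select/4 follow the expression T = Q'*PP + C of the first choice rule in Example 7. The Python sketch below reproduces that arithmetic for the apple entries of the sample database. It only enumerates the candidate atoms of the choice rule; which candidates are actually selected is decided by the ASP solver through the constraint and the weak constraints. The dictionaries and the function `select_candidates` are hypothetical names used for this illustration.

```python
# Data copied from the sample database of Example 7.
PRICE = {("apple", "Lattanzi Warehouse"): 2, ("apple", "Verza Warehouse"): 4}
STOCK = {("apple", "Lattanzi Warehouse"): 20, ("apple", "Verza Warehouse"): 3}
SHIPPING = {"Lattanzi Warehouse": 7, "Verza Warehouse": 5}

def select_candidates(product: str, requested: int) -> list[tuple]:
    """Candidate atoms select(P,W,Q',T) with Q' in 1..min(Q,R) and T = Q'*PP + C."""
    candidates = []
    for (name, warehouse), price in PRICE.items():
        if name != product:
            continue
        limit = min(STOCK[(name, warehouse)], requested)
        for quantity in range(1, limit + 1):
            total = quantity * price + SHIPPING[warehouse]
            candidates.append((product, warehouse, quantity, total))
    return candidates

for candidate in select_candidates("apple", 4):
    print(candidate)
```

For a single apple, both warehouses yield a total of 9 (1*2 + 7 and 1*4 + 5), while the recipe total of select_for_recipe/5 excludes shipping, for instance 2*4 = 8 for four apples from Lattanzi Warehouse.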

Notation. For 𝑎, 𝑏, 𝑐 being strings, let 𝑎[𝑏 ↦ 𝑐] denote the string obtained from 𝑎 by replacing
all occurrences of 𝑏 with 𝑐. Given a behavior file 𝐵, let pre_𝐵(𝛼) be the value associated with 𝛼
in the preprocessing mapping of 𝐵, where 𝛼 is among init, context and mapping. Similarly, let
post_𝐵(𝛼) be the value associated with 𝛼 in the postprocessing mapping of 𝐵, where 𝛼 is among
init, context, mapping and summarize. For a YAML application file 𝐴, let pre_𝐴(𝛼) be the value
associated with 𝛼 in the preprocessing mapping, where 𝛼 is either an atom or _. Similarly, let post_𝐴(𝛼)
be the value associated with 𝛼 in the postprocessing mapping, where 𝛼 is either an atom or _. Finally,
let kb_𝐴 be the ASP program associated with knowledge base in 𝐴.
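
The replacement notation corresponds directly to string replacement. The following Python sketch chains three replacements on a mapping template in the style of Example 6; the function name `substitute` and the sample strings are assumptions of this illustration.

```python
def substitute(a: str, b: str, c: str) -> str:
    """Return a[b -> c]: the string a with all occurrences of b replaced by c."""
    return a.replace(b, c)

template = ("[USER_INPUT]§input§[/USER_INPUT]\n"
            "§instructions§\n"
            "[ANSWER_FORMAT]§atom§.[/ANSWER_FORMAT]")

# Replacements are applied left to right, one placeholder at a time.
prompt = substitute(template, "§input§", "Add two apples to my cart")
prompt = substitute(prompt, "§atom§", 'product_request("product")')
prompt = substitute(prompt, "§instructions§", "List all the products mentioned or requested.")
print(prompt)
```

Since the replacements are applied in sequence, a placeholder string occurring in an earlier replacement value (for instance, in the user input) would itself be replaced by a later step.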


4. LLMASP Pipeline
The architecture of LLMASP is shown in Figure 1. In this section, we discuss its core components and
illustrate its potential through a practical example.
   LLMASP takes as input the two YAML files introduced in Section 3, namely the behavior file 𝐵 and
the application file 𝐴, together with a database file 𝐷 (comprising facts) and a request text 𝑇 (expressed
by the user in natural language). The system undergoes a sequence of interactions with an LLM to
populate a set 𝐹 of facts and a set 𝑅 of responses, which are initially empty. Specifically, LLMASP
executes the following procedure:
 P1. The LLM is invoked with the prompt pre_𝐵(init).
 P2. If pre_𝐴(_) is defined, the LLM is invoked with the following prompt:
     pre_𝐵(context)[§context§ ↦ pre_𝐴(_)].
 P3. For each atom 𝛼 such that pre_𝐴(𝛼) is defined, the LLM is invoked with the prompt
     pre_𝐵(mapping)[§input§ ↦ 𝑇][§atom§ ↦ 𝛼][§instructions§ ↦ pre_𝐴(𝛼)]. Facts in the
     response are collected in 𝐹. Everything else is ignored.
 P4. An answer set of kb_𝐴 ∪ {𝛼. | 𝛼 ∈ 𝐷 ∪ 𝐹}, say 𝐼, is searched. If no answer set exists, the
     process terminates with a failure.
 P5. The LLM is reset and invoked with the prompt post_𝐵(init).
 P6. If post_𝐴(_) is defined, the LLM is invoked with the following prompt:
     post_𝐵(context)[§context§ ↦ post_𝐴(_)].
 P7. For each atom 𝛼 such that post_𝐴(𝛼) is defined, the LLM is invoked with the prompt
     post_𝐵(mapping)[§facts§ ↦ 𝐼][§atom§ ↦ 𝛼][§instructions§ ↦ post_𝐴(𝛼)]. Responses
     are collected in 𝑅.
 P8. The LLM is invoked with the prompt post_𝐵(summarize)[§responses§ ↦ 𝑅]. The response is
     provided in output.
In our implementation, the LLM is the latest iteration of Meta’s open-source project Llama
(https://llama.meta.com/llama3/), which at the time of writing has reached its third version, and the ASP
system used is clingo [16].
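
The control flow of P1–P8 can be summarized by the Python sketch below. It is not the LLMASP implementation: the LLM and the ASP solver are replaced by stand-ins, the two YAML files are assumed to be already loaded as plain dictionaries (with the application mappings flattened into atom-to-instruction dictionaries), and all function and class names are hypothetical.

```python
import re
from typing import Optional

def fill(template: str, **values: str) -> str:
    """Replace each placeholder §name§ with its value, in the given order."""
    for name, value in values.items():
        template = template.replace(f"§{name}§", value)
    return template

def run_pipeline(behavior, application, database, text, llm, solve) -> Optional[str]:
    pre_b, post_b = behavior["preprocessing"], behavior["postprocessing"]
    pre_a, post_a = application["preprocessing"], application["postprocessing"]
    llm.ask(pre_b["init"])                                            # P1
    if "_" in pre_a:                                                  # P2
        llm.ask(fill(pre_b["context"], context=pre_a["_"]))
    facts = []
    for atom, instructions in pre_a.items():                          # P3
        if atom != "_":
            response = llm.ask(fill(pre_b["mapping"], input=text,
                                    atom=atom, instructions=instructions))
            facts += re.findall(r"[a-z_]\w*\([^()]*\)\.", response)
    answer_set = solve("\n".join([application["knowledge base"], database, *facts]))  # P4
    if answer_set is None:
        return None                                                   # failure
    llm.reset()                                                       # P5
    llm.ask(post_b["init"])
    if "_" in post_a:                                                 # P6
        llm.ask(fill(post_b["context"], context=post_a["_"]))
    responses = [llm.ask(fill(post_b["mapping"], facts=" ".join(answer_set),
                              atom=atom, instructions=instructions))  # P7
                 for atom, instructions in post_a.items() if atom != "_"]
    return llm.ask(fill(post_b["summarize"], responses="\n".join(responses)))  # P8

class ScriptedLLM:
    """Stand-in for a real LLM: records prompts and returns canned text."""
    def __init__(self):
        self.prompts = []
    def reset(self):
        pass
    def ask(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return 'product_request("apple").' if "[USER_INPUT]" in prompt else "ok"

def facts_only_solver(program: str):
    """Stand-in for an ASP solver: a program made only of facts has a
    unique answer set, namely the set of those facts."""
    return [line.rstrip(".") for line in program.splitlines() if line.strip()]

behavior = {
    "preprocessing": {"init": "Translate to Datalog.", "context": "Context: §context§",
                      "mapping": "[USER_INPUT]§input§[/USER_INPUT] §instructions§ "
                                 "[ANSWER_FORMAT]§atom§.[/ANSWER_FORMAT]"},
    "postprocessing": {"init": "Translate to text.", "context": "Context: §context§",
                       "mapping": "[FACTS]§facts§[/FACTS] §atom§: §instructions§",
                       "summarize": "Summarize: §responses§"},
}
application = {
    "preprocessing": {"_": "Food marketplace.",
                      'product_request("product")': "List the products."},
    "knowledge base": "",
    "postprocessing": {"_": "Be concise.",
                       'product_request("product")': "Say that the product was requested."},
}
llm = ScriptedLLM()
answer = run_pipeline(behavior, application, "", "I would like apples.", llm, facts_only_solver)
print(answer, len(llm.prompts))
```

Passing the LLM and the solver as arguments keeps the control flow independent of a specific model or ASP system, which also makes the sketch testable without network access.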

Example 8. Let 𝐴 be the application file given in Example 7, and 𝐵 be the behavior file reported in
Example 6. Let 𝑇 be the following text:
          I would like some cooking ideas with apples for dessert and a main plate with meat.

The LLM is invoked with the initial fixed prompt of P1, and then with the prompt of P2:
          Here is some context that you MUST analyze and remember.
          The marketplace offers food products. Products and product preferences will be talked about.
          Remember this context and don’t say anything!

After that, the LLM is invoked two times with the prompt of P3 to populate set 𝐹 . For example,
          [USER_INPUT]I would like some cooking ideas with apples for dessert and a main plate with meat.
          [/USER_INPUT]
          List all the products mentioned or requested. If a product is named MUST be listed. Ignore plural,
          always write the product name in singular.
          [ANSWER_FORMAT]product_request("product").[/ANSWER_FORMAT]

The LLM may provide the response
          [ANSWER_FORMAT]product_request("apple").[/ANSWER_FORMAT]
          [ANSWER_FORMAT]product_request("meat").[/ANSWER_FORMAT]

from which the following facts are extracted and added to 𝐹 :
        product_request("apple"). product_request("meat").

Once all facts are collected, the knowledge base is used to search an answer set (P4), say one containing
the following atoms:
        matching_recipe("Apple Pie","apple").
        matching_recipe("Meat and Onion","meat").
        select("apple","Verza Warehouse",1,9).
        select("meat","Rellman Warehouse",1,11).
        select_for_recipe("Apple Pie","apple","Lattanzi Warehouse",4,8).
        select_for_recipe("Meat and Onion","meat","Rellman Warehouse",5,35).

The LLM is now invoked with the prompts of P5–P7. In particular, the prompt of P7 is the following:

          [FACTS]select_for_recipe("Apple Pie","apple","Lattanzi Warehouse",4,8).
          select_for_recipe("Meat and Onion","meat","Rellman Warehouse",5,35). [/FACTS]
          Each fact matching select_for_recipe("recipe", "product", "warehouse", "quantity", "total")
          must be interpreted as follows: Say to buy "quantity" of "product" for "total" including
          shipping costs from "warehouse" if the customer desire to make "recipe". Do not forget the
          quantity.

The response, say

          Consider buying 4 apples for 8 from Lattanzi Warehouse if you desire to make Apple Pie.
          Consider buying 5 meat for 35 from Rellman Warehouse if you desire to make Meat and
          Onion.
is added to 𝑅. Finally, the LLM is invoked with the prompt of P8:
      Summarize the following responses: You might enjoy making an Apple Pie with our apples.
      Consider preparing a Meat and Onion dish using our meat. Suggest to select 1 of the apple
      for a cost of 9 from the Verza Warehouse. This price includes the shipping costs. Suggest
      to select 1 of the meat for a cost of 11 from the Rellman Warehouse. This price includes the
      shipping costs. Consider buying 4 apples for 8 from Lattanzi Warehouse if you desire to
      make Apple Pie. Consider buying 5 meat for 35 from Rellman Warehouse if you desire to
      make Meat and Onion.
The response, say
      Why not try making an Apple Pie with our delicious apples, or a savory Meat and Onion
      dish using our fresh meat? You can get 1 apple from Verza Warehouse for $9 or 4 from
      Lattanzi Warehouse for $8, and pair it with 1 meat from Rellman Warehouse for $11 or 5
      for $35!
is provided in output.                                                                               ■
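
The extraction of facts from a response in P3 ("everything else is ignored") can be approximated with a regular expression that follows the answer format stated in the init prompt of Example 6: a lowercase predicate (possibly with underscores) applied to double-quoted strings or integers. The sketch below is one possible reading of that step, not the code used by LLMASP; the names `TERM`, `FACT` and `extract_facts`, as well as the arity check, are assumptions.

```python
import re

TERM = r'(?:"[^"]*"|-?\d+)'
FACT = re.compile(rf'\b([a-z][a-z_]*)\(({TERM}(?:\s*,\s*{TERM})*)\)\.')

def extract_facts(response: str, predicate: str, arity: int) -> list[str]:
    """Keep only well-formed facts of the expected predicate and arity."""
    facts = []
    for match in FACT.finditer(response):
        name, terms = match.group(1), match.group(2)
        if name == predicate and len(re.findall(TERM, terms)) == arity:
            facts.append(f"{name}({terms}).")
    return facts

response = ('[ANSWER_FORMAT]product_request("apple").[/ANSWER_FORMAT]\n'
            '[ANSWER_FORMAT]product_request("meat").[/ANSWER_FORMAT]')
print(extract_facts(response, "product_request", 1))
```

Filtering by predicate and arity discards both free text and facts that do not match the atom requested by the current prompt.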
   The main benefit of the new LLMASP version compared to the initial proposal introduced at CILC
2024 [15] is the ease of testing various behavior files. This feature allows for experimentation with
different prompts to manage the interaction and behavior of the LLM.
Example 9. Let us consider again the application file reported in Example 7, and the following behavior
file (different from the one reported in Example 6):
preprocessing:
  init: |
    As an ASP translator, your primary task is to convert natural language
    descriptions, provided in the format [INPUT]input[/INPUT], into precise ASP
    code, outputting in the format [OUTPUT]predicate(terms).[/OUTPUT]. Focus on
    identifying key entities and relationships to create facts (e.g., [INPUT]Alice
    is happy[/INPUT] becomes [OUTPUT]happy(alice).[/OUTPUT]), [INPUT]Bob owns a
    car[/INPUT] becomes [OUTPUT]owns(bob, car)[/OUTPUT], [INPUT]The sky is
    blue[/INPUT] becomes [OUTPUT]color(sky, blue)[/OUTPUT], and [INPUT]Cats are
    mammals[/INPUT] becomes [OUTPUT]mammal(cat)[/OUTPUT]. Ensure that the natural
    language intent is accurately and logically reflected in the ASP code.
    Maintain semantic accuracy by ensuring logical consistency and correctly
    reflecting the natural language intent in your ASP code. Remember these
    instructions and don't say anything!
  context: |
    Here is some context that you MUST analyze and remember.
    §context§
    Remember this context and don't say anything!
  mapping: |
    [INPUT]§input§[/INPUT]
    §instructions§
    [OUTPUT]§atom§.[/OUTPUT]

postprocessing:
  init: |
    As an ASP to natural language translator, you will convert ASP facts provided
    in the format [FACTS]atoms[/FACTS] into clear natural language statements
    using predefined mapping instructions. For example,
    [FACTS]happy(alice)[/FACTS] should be translated to "Alice is happy,"
    [FACTS]friend(alice, bob)[/FACTS] to "Alice is friends with Bob," and
    [FACTS]owns(bob, car)[/FACTS] to "Bob owns a car." Ensure each fact is
    accurately and clearly represented in natural language, maintaining the
    integrity of the original information.
    Remember these instructions and don't say anything!
  context: |
    Here is some context that you MUST analyze and remember.
    §context§
    Remember this context and don't say anything!
  mapping: |
    [FACTS]§facts§[/FACTS]
    Each fact matching §atom§ must be interpreted as follows:
    §instructions§
  summarize: |
    Summarize the following responses:
    §responses§

  In this case, the prompt used in pre_𝐵(init) was enriched using in-context learning (ICL), a specific
method of prompt engineering in which demonstrations of the task are provided to the model as part of
the prompt (in natural language) [17]. Considering the same input text as in Example 8, after invoking the
LLM with the prompt of P3, the provided response is

      [OUTPUT]product_request("apple").[/OUTPUT]
      [OUTPUT]product_request("apple", 0).[/OUTPUT]
      [OUTPUT]product_request("meat").[/OUTPUT]
      [OUTPUT]product_request("meat", 0).[/OUTPUT]

In this case, no quantities were specified in the user input; however, the model still generated a
0-quantity request for each product.                                                                  ■
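
The response above interleaves atoms of different arities. As a minimal sketch (not necessarily how
LLMASP implements step P3), the tagged atoms could be extracted and grouped by signature as shown
below. The helper names split_terms and extract_facts are hypothetical, and quoted strings are
assumed not to contain escaped quotes.

```python
import re

# The [OUTPUT]...[/OUTPUT] tags follow the behavior file of Example 9.
OUTPUT_TAG = re.compile(r"\[OUTPUT\](.*?)\[/OUTPUT\]", re.DOTALL)
# predicate(arguments) with an optional final period
ATOM = re.compile(r"^\s*([a-z][A-Za-z0-9_]*)\((.*)\)\s*\.?\s*$", re.DOTALL)


def split_terms(args: str) -> list[str]:
    """Split a comma-separated argument list, ignoring commas inside quotes."""
    terms, current, in_quotes = [], [], False
    for ch in args:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "," and not in_quotes:
            terms.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    terms.append("".join(current).strip())
    return terms


def extract_facts(response: str, predicate: str, arity: int) -> list[str]:
    """Keep the tagged atoms matching predicate/arity, normalized as ASP facts."""
    facts = []
    for body in OUTPUT_TAG.findall(response):
        match = ATOM.match(body)
        if match is None:
            continue  # not a well-formed atom: skip it
        name, args = match.groups()
        terms = split_terms(args)
        if name == predicate and len(terms) == arity:
            facts.append(f"{name}({','.join(terms)}).")
    return facts


response = """
[OUTPUT]product_request("apple").[/OUTPUT]
[OUTPUT]product_request("apple", 0).[/OUTPUT]
[OUTPUT]product_request("meat").[/OUTPUT]
[OUTPUT]product_request("meat", 0).[/OUTPUT]
"""
unary = extract_facts(response, "product_request", 1)
binary = extract_facts(response, "product_request", 2)
print(unary)   # ['product_request("apple").', 'product_request("meat").']
print(binary)  # ['product_request("apple",0).', 'product_request("meat",0).']
```

Grouping by signature makes it easy to spot atoms such as the 0-quantity requests above before they
are passed to the ASP solver.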

  Continuing with the behavior file discussed in Example 9, let’s now consider a different application:
an intelligent assistant. Still within the context of a marketplace, this assistant will aid in managing
inventory efficiently and effectively.

Example 10. Below is the application file of an assistant that monitors inventory levels and provides
notifications when it is necessary to replenish products in the warehouse.
preprocessing:
- _: You are an assistant for logistics management. Products, warehouses and
    products stocks will be talked about.
- product_request("product", quantity).: List all the products mentioned or
    requested with a quantity associated. If no quantity is mentioned, assume 1.
    Ignore plural, always write the product name in singular.

knowledge_base: |
  #script(python)
  def min(a, b): return a if a < b else b
  #end.

  % guess selection of products
  {select(P,W,Q',S) : Q' = 1..@min(Q,R), S = Q-Q'} <= 1 :-
    product_request(P,R),
    product_price(P,W,PP),
    warehouse(W),
    warehouse_shipping_cost(W,C),
    product_in_warehouse(P,W,Q).

  % select the correct amount of products
  :- product_request(P,R), #sum{Q,W : select(P,W,Q,_)} != R.

  % minimize shipping cost
  :~ warehouse_shipping_cost(W,C),
    warehouse_free_shipping(W,T),
    select(_,W,Q,_), Q > 0,
    #sum{Q' * Price,P : select(P,W,Q',_), product_price(P,W,Price)} < T.
    [C@3, W]

  #show select/4.

postprocessing:
- _: |
    You are an assistant for logistics management in an online marketplace, which
    is talking to a manager. Your priority is to keep track of product stocks and
    inventory. Do not mention any product that is not explicitly provided to you
    before. Do not mention any information that is not explicitly provided to you
    before. If there is a 0 quantity associated with a product say that is out of
    stock. Your answers should be suggestions for the manager to keep the
    warehouses full of products. It must guide the manager to place the products
    in the warehouses. Always limit your responses to a maximum of 100 characters.
- select("product", "warehouse", "quantity", "stock").: |
    Suggest that if "product" with "quantity" is selected from "warehouse", the
    remaining "stock" quantity in the warehouse should be tracked.
    Suggest to consider placing more products.

   Consider the following configuration: let 𝐴 be the application file from Example 10 and 𝐵 the
behavior file from Example 9. Let 𝑇 be the following input text:

      What happens if we get an order of 22 apples?

Following steps P1-P8, the final output of the system is

      Restock apples in Verza Warehouse (out of stock) and consider restocking in Lattanzi
      Warehouse (only 1 left).
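
To make the reading of the knowledge base of Example 10 more concrete, the following sketch
enumerates by brute force the selections admitted by the choice rule and the integrity constraint,
and then ranks them according to the weak constraint. It is only an illustration under strong
assumptions: all numbers and the warehouse names w1 and w2 are hypothetical and unrelated to the
database of our running example, a single product is considered, and clingo is not invoked.

```python
from itertools import product as cartesian

# Hypothetical facts for a single product (say, "apple").
request = 3                        # product_request("apple", 3)
stock = {"w1": 2, "w2": 4}         # product_in_warehouse("apple", W, Q)
price = {"w1": 2, "w2": 3}         # product_price("apple", W, PP)
shipping = {"w1": 5, "w2": 4}      # warehouse_shipping_cost(W, C)
free_from = {"w1": 10, "w2": 6}    # warehouse_free_shipping(W, T)


def candidates(warehouse):
    """Choice rule: at most one select(P,W,Q',S), Q' in 1..min(Q,R), S = Q-Q'."""
    q = stock[warehouse]
    options = [None]  # None stands for "no select atom for this warehouse"
    options += [(taken, q - taken) for taken in range(1, min(q, request) + 1)]
    return options


def shipping_cost(selection):
    """Weak constraint: pay C once per used warehouse whose order total is below T."""
    total = 0
    for warehouse, choice in selection.items():
        if choice is None:
            continue
        taken, _remaining = choice
        if taken * price[warehouse] < free_from[warehouse]:
            total += shipping[warehouse]
    return total


warehouses = sorted(stock)
answer_sets = []
for combo in cartesian(*(candidates(w) for w in warehouses)):
    picked = sum(choice[0] for choice in combo if choice is not None)
    if picked == request:  # integrity constraint: selected quantities sum to R
        answer_sets.append(dict(zip(warehouses, combo)))

best = min(answer_sets, key=shipping_cost)
print(len(answer_sets))            # 3
print(best, shipping_cost(best))   # {'w1': None, 'w2': (3, 1)} 0
```

With these hypothetical numbers, taking all three units from w2 reaches the free-shipping threshold,
so that selection is preferred; the second component of each pair is the remaining stock reported by
select/4.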

   Using YAML configuration files allows for easy and human-readable editing of the system settings,
enabling quick modifications without altering the underlying code. This flexibility facilitates testing
different scenarios and generating varied outputs by simply updating the configuration files.
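
As a rough illustration of why swapping behavior files requires no code change, the sketch below
fills the §...§ placeholders of the preprocessing mapping template of Example 9. The function
render is hypothetical and is not claimed to mirror the LLMASP implementation; the input sentence
is made up, and whether the final period of the atom format is kept or stripped before substitution
is an assumption.

```python
import re

# A placeholder is a word enclosed in section signs, e.g. §input§.
PLACEHOLDER = re.compile(r"§(\w+)§")


def render(template: str, **values: str) -> str:
    """Replace every §name§ placeholder; fail loudly if a value is missing."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder §{name}§")
        return values[name]
    return PLACEHOLDER.sub(substitute, template)


# Preprocessing mapping template, as in the behavior file of Example 9.
mapping_template = (
    "[INPUT]§input§[/INPUT]\n"
    "§instructions§\n"
    "[OUTPUT]§atom§.[/OUTPUT]\n"
)

prompt = render(
    mapping_template,
    input="I need 3 apples.",  # hypothetical user input
    instructions="List all the products mentioned or requested with a quantity associated.",
    atom='product_request("product", quantity)',  # assumed: final period stripped
)
print(prompt)
```

Since the template is plain data, a different behavior file only changes the string passed to
render, while the application file keeps supplying the instructions and the atom format.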


5. Related Work
In general, LLMs, including ChatGPT [1], PaLM [2], and LLaMa [3], demonstrate an impressive ability
to utilize semantic relationships among tokens within natural language sequences of various lengths.
They excel particularly in sequence-to-sequence (seq2seq) tasks, where a text input triggers a text
output generated by the model. The applications of seq2seq models are diverse, covering areas such as
machine translation [18], answering factual questions [19], executing basic arithmetic operations [20],
text summarization [21], and chatbot functionalities [22]. It is reasonable to expect that these models
could also excel in information extraction tasks.
   The extraction of relational facts from text has been a continuous effort in Natural Language
Processing. The capability to identify semantic relationships between entities in text allows one to convert
unstructured raw text into structured data, which can then be utilized in various downstream tasks
and applications, including the development of Knowledge Bases. In [23], the researchers introduce a
sequence-to-sequence model built on BART [24] capable of executing end-to-end extraction for over
200 distinct relation categories. Their approach aligns with a growing trend where seq2seq Transformer
models, such as BART or T5 [25], are employed in NLU tasks, including Entity Linking [26] and Semantic
Role Labeling [27], by redefining them as seq2seq problems. Although the most advanced
Transformer-based models, such as GPT-3 and GPT-4, have grown in size and demonstrate outstanding
performance in numerous natural language processing tasks, these systems still exhibit limited
reasoning capabilities despite the use of various prompting methods [28]. An alternative perspective,
highlighted by [29], suggests that LLMs are suitable for what Kahneman refers to as System-1 thinking.
This is because they are designed to predict the next word in a sequence without deep comprehension
of crucial reasoning concepts like causality, logic, and probability. Merging LLMs with logical reasoning
into a neurosymbolic framework is an active research field. For example, considering ASP, a recent
study [30] uses a dual phase architecture, incorporated in the NL2ASP tool, to produce ASP programs
from natural language descriptions. NL2ASP utilizes neural machine translation to convert natural
language into Controlled Natural Language (CNL) statements, which are then translated into ASP code
via the CNL2ASP tool. Although CNL [31] is a well-established approach to bridging the gap between
the flexibility of natural language (NL) and the strictness of formal languages, it still faces
limitations regarding its expressiveness [32]. In our approach, we bypass the use of
intermediate languages like CNL. Instead, we implement a method where the users articulate their
intents in NL, and arrange the information more systematically to align with LLM input preferences in
order to minimize irrelevant responses. It is important to highlight that our task is less complex than
the one tackled by [30]. Unlike [30], which aims to convert textual information into a formal knowledge
base, we focus on a more straightforward task that involves only the input and output of the
ASP solver, without attempting a semantically demanding translation of the application domain.
Moreover, combining the two methods could lessen the effort required from the end user.
   Pursuing an objective similar to that of [30], but with a prompt engineering strategy, [33] suggests
carefully crafting prompts for an LLM to transform natural language descriptions into ASP incrementally. They
demonstrate that, with just a handful of in-context learning instances, LLMs are capable of producing
fairly complex answer set programs. The proposed pipeline initially identifies the relevant objects and
their categories. Subsequently, it forms a predicate that delineates the relationships between objects
from various categories. Using these derived data, the pipeline proceeds to build an ASP program
following the Generate-Define-Test paradigm.
   The main idea of [33] involves using LLMs as an interface for answer set programming, thereby
leveraging their language processing capabilities to convert NL descriptions into the declarative syntax
of answer set programs. In contrast to approaches that rely on algorithms or machine learning methods,
the authors observe that LLMs, when provided with an effective prompt, can produce fairly accurate
answer set programs. Although we share with [33] the insight that effective prompt engineering is
adequate for initial entity and relation extraction, we depart from their findings by asserting that the
complexity of real-world applications necessitates a more structured approach to ASP than the pipeline
proposed in their work. Additionally, because the authors appear not to make any specific assumptions
about the problem description required to generate the final program with an LLM, we aim to explore in
future research the induction of logic programs from unstructured text using LLMs.
   The work closest to our goal that we found in the literature is [34]. In this work, the authors
focus primarily on the Question Answering (QA) task, using an LLM to transform a problem description
(comprising context S and query q) into atomic facts. Subsequently, the ASP solver processes these facts
along with background knowledge represented as ASP rules to derive an answer a. This method does not
rely on training datasets². Instead, only a small number of examples were used as few-shot contextual
data for the LLM input. Supplying this meta-information about the problem context consequently
facilitates the translation of natural language sentences into atomic facts, thereby enhancing the LLMs’
semantic parsing abilities. We set our work apart by proposing a more organized method for the prompt
engineering task that does not rely on few-shot examples. However, integrating contextual few-shot
prompting into our framework is feasible at a low cost; users simply need to create a small number of
examples based on the complexity of the problem to guide the LLM. We will explore this possibility in
future work.
   Finally, despite their individual successes, the integration of LLMs and ASP remains relatively
unexplored. This synergy presents a unique opportunity to leverage the strengths of both paradigms:
LLMs for their linguistic and contextual capabilities, and ASP for its precise, logic-based reasoning.
By combining these technologies, we can create powerful AI systems capable of both understanding
complex natural language inputs and performing sophisticated logical inference [35, 36].

² The authors nevertheless have access to labeled QA datasets in which the tuple ⟨𝑆, 𝑞, 𝑎⟩ is given.


6. Conclusion
In this paper, we have introduced an approach for combining Large Language Models (LLMs) and Answer
Set Programming (ASP) to harness their complementary strengths in natural language understanding
and logical reasoning. Our prototype system (https://github.com/lewashby/llmasp) is written in Python,
and it is powered by the LLM llama3:70b from ollama³ and the clingo Python API [37]. By providing
predefined prompts and enriching specifications with domain-specific knowledge, our approach enables
users to tailor the system to diverse problem domains and applications, enhancing its adaptability and
versatility. With respect to the previous version of the system [15], the predefined prompts can be easily
modified as they are stored in a separate YAML file. Such a separation is a first step toward our future
research directions, which encompass the evaluation of the quality of the answers provided by our
integrated LLMASP system, and the exploration of different prompts to improve the overall quality
of the system.


Acknowledgments
This work was supported by Italian Ministry of University and Research (MUR) under PRIN project PRODE
“Probabilistic declarative process mining”, CUP H53D23003420006, under PNRR project FAIR “Future AI Research”,
CUP H23C22000860006, under PNRR project Tech4You “Technologies for climate change adaptation and quality
of life improvement”, CUP H23C22000370006, and under PNRR project SERICS “SEcurity and RIghts in the
CyberSpace”, CUP H73C22000880001; by Italian Ministry of Health (MSAL) under POS projects CAL.HUB.RIA
(CUP H53C22000800006) and RADIOAMICA (CUP H53C22000650006); by Italian Ministry of Enterprises and Made
in Italy under project STROKE 5.0 (CUP B29J23000430005); and by the LAIA lab (part of the SILA labs). Alviano
is a member of Gruppo Nazionale Calcolo Scientifico-Istituto Nazionale di Alta Matematica (GNCS-INdAM).


References
    [1] T. B. Brown, et al., Language models are few-shot learners, CoRR abs/2005.14165 (2020). URL:
        https://arxiv.org/abs/2005.14165. arXiv:2005.14165.
    [2] A. Chowdhery, et al., Palm: Scaling language modeling with pathways, J. Mach. Learn. Res. 24
        (2023) 240:1–240:113. URL: http://jmlr.org/papers/v24/22-1144.html.
    [3] H. Touvron, et al., Llama: Open and efficient foundation language models, CoRR abs/2302.13971
        (2023). doi:10.48550/ARXIV.2302.13971. arXiv:2302.13971.
    [4] V. Marek, M. Truszczyński, Stable models and an alternative logic programming paradigm,
        in: The Logic Programming Paradigm: a 25-year Perspective, 1999, pp. 375–398.
        doi:10.1007/978-3-642-60085-2_17.
    [5] I. Niemelä, Logic programming with stable model semantics as a constraint programming
        paradigm, Annals of Mathematics and Artificial Intelligence 25 (1999) 241–273.
        doi:10.1023/A:1018930122475.
    [6] M. Gelfond, V. Lifschitz, Logic programs with classical negation, in: D. Warren, P. Szeredi (Eds.),
        Logic Programming: Proc. of the Seventh International Conference, 1990, pp. 579–597.
    [7] H. Jin, Y. Zhang, D. Meng, J. Wang, J. Tan, A comprehensive survey on process-oriented automatic
        text summarization with exploration of llm-based methods, CoRR abs/2403.02901 (2024).
        doi:10.48550/ARXIV.2403.02901. arXiv:2403.02901.
    [8] W. Zhang, X. Li, Y. Deng, L. Bing, W. Lam, A survey on aspect-based sentiment analysis: Tasks,
        methods, and challenges, IEEE Trans. Knowl. Data Eng. 35 (2023) 11019–11038.
        doi:10.1109/TKDE.2022.3230975.
    [9] P. Cappanera, M. Gavanelli, M. Nonato, M. Roma, Logic-based benders decomposition in answer
        set programming for chronic outpatients scheduling, Theory Pract. Log. Program. 23 (2023)
        848–864. doi:10.1017/S147106842300025X.
³ https://ollama.com/
[10] M. Cardellini, P. D. Nardi, C. Dodaro, G. Galatà, A. Giardini, M. Maratea, I. Porro, Solving
     rehabilitation scheduling problems via a two-phase ASP approach, Theory Pract. Log. Program.
     24 (2024) 344–367. doi:10.1017/S1471068423000030.
[11] F. Wotawa, On the use of answer set programming for model-based diagnosis, in: H. Fujita,
     P. Fournier-Viger, M. Ali, J. Sasaki (Eds.), Trends in Artificial Intelligence Theory and Applications.
     Artificial Intelligence Practices - 33rd International Conference on Industrial, Engineering and
     Other Applications of Applied Intelligent Systems, IEA/AIE 2020, Kitakyushu, Japan, September
     22-25, 2020, Proceedings, volume 12144 of Lecture Notes in Computer Science, Springer, 2020, pp.
     518–529. doi:10.1007/978-3-030-55789-8_45.
[12] R. Taupe, G. Friedrich, K. Schekotihin, A. Weinzierl, Solving configuration problems with ASP and
     declarative domain specific heuristics, in: M. Aldanondo, A. A. Falkner, A. Felfernig, M. Stettinger
     (Eds.), Proceedings of the 23rd International Configuration Workshop (CWS/ConfWS 2021), Vienna,
     Austria, 16-17 September, 2021, volume 2945 of CEUR Workshop Proceedings, CEUR-WS.org, 2021,
     pp. 13–20. URL: https://ceur-ws.org/Vol-2945/21-RT-ConfWS21_paper_4.pdf.
[13] K. Basu, S. C. Varanasi, F. Shakerin, J. Arias, G. Gupta, Knowledge-driven natural language
     understanding of english text and its applications, in: Thirty-Fifth AAAI Conference on
     Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of
     Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial
     Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, AAAI Press, 2021, pp. 12554–12563.
     doi:10.1609/AAAI.V35I14.17488.
[14] Y. Zeng, A. Rajasekharan, P. Padalkar, K. Basu, J. Arias, G. Gupta, Automated interactive domain-
     specific conversational agents that understand human dialogs, in: M. Gebser, I. Sergey (Eds.),
     Practical Aspects of Declarative Languages - 26th International Symposium, PADL 2024, London,
     UK, January 15-16, 2024, Proceedings, volume 14512 of Lecture Notes in Computer Science, Springer,
     2024, pp. 204–222. doi:10.1007/978-3-031-52038-9_13.
[15] M. Alviano, L. Grillo, Answer set programming and large language models interaction with yaml:
     Preliminary report, in: CILC, CEUR Workshop Proceedings, CEUR-WS.org, 2024.
[16] M. Gebser, R. Kaminski, T. Schaub, Complex optimization in answer set programming, Theory
     Pract. Log. Program. 11 (2011) 821–839.
[17] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan,
     P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child,
     A. Ramesh, D. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray,
     B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei,
     Language models are few-shot learners, in: H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan,
     H. Lin (Eds.), Advances in Neural Information Processing Systems, volume 33,
     Curran Associates, Inc., 2020, pp. 1877–1901. URL:
     https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
[18] R. Dabre, C. Chu, A. Kunchukuttan, A survey of multilingual neural machine translation, ACM
     Computing Surveys (CSUR) 53 (2020) 1–38.
[19] F. Petroni, T. Rocktäschel, P. Lewis, A. Bakhtin, Y. Wu, A. H. Miller, S. Riedel, Language models as
     knowledge bases?, arXiv preprint arXiv:1909.01066 (2019).
[20] J. Hoffmann, S. Borgeaud, A. Mensch, E. Buchatskaya, T. Cai, E. Rutherford, D. d. L. Casas, L. A.
     Hendricks, J. Welbl, A. Clark, et al., Training compute-optimal large language models, arXiv
     preprint arXiv:2203.15556 (2022).
[21] H. Zhang, J. Xu, J. Wang, Pretraining-based natural language generation for text summarization,
     arXiv preprint arXiv:1902.09243 (2019).
[22] Z. Liu, M. Patwary, R. Prenger, S. Prabhumoye, W. Ping, M. Shoeybi, B. Catanzaro, Multi-stage
     prompting for knowledgeable dialogue generation, in: Findings of the Association for
     Computational Linguistics: ACL 2022, 2022, pp. 1317–1337.
[23] P.-L. H. Cabot, R. Navigli, Rebel: Relation extraction by end-to-end language generation, in:
     Findings of the Association for Computational Linguistics: EMNLP 2021, 2021, pp. 2370–2381.
[24] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer,
     Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation,
     and comprehension, in: Proceedings of the 58th Annual Meeting of the Association for
     Computational Linguistics, 2020, pp. 7871–7880.
[25] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring
     the limits of transfer learning with a unified text-to-text transformer, Journal of machine learning
     research 21 (2020) 1–67.
[26] N. De Cao, G. Izacard, S. Riedel, F. Petroni, Autoregressive entity retrieval, in: ICLR 2021-9th
     International Conference on Learning Representations, volume 2021, ICLR, 2020.
[27] R. Blloshmi, S. Conia, R. Tripodi, R. Navigli, et al., Generating senses and roles: An end-to-end
     model for dependency-and span-based semantic role labeling, in: Proc. of 30th International Joint
     Conference on Artificial Intelligence, IJCAI 2021, 2021, pp. 3786–3793.
[28] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou, et al., Chain-of-thought
     prompting elicits reasoning in large language models, Advances in neural information processing
     systems 35 (2022) 24824–24837.
[29] M. Nye, M. Tessler, J. Tenenbaum, B. M. Lake, Improving coherence and consistency in neural
     sequence models with dual-system, neuro-symbolic reasoning, Advances in Neural Information
     Processing Systems 34 (2021) 25192–25204.
[30] M. Borroto, I. Kareem, F. Ricca, Towards automatic composition of asp programs from natural
     language specifications, arXiv preprint arXiv:2403.04541 (2024).
[31] T. Kuhn, A survey and classification of controlled natural languages, Computational linguistics 40
     (2014) 121–170.
[32] R. Schwitter, Controlled natural languages for knowledge representation, in: Coling 2010: Posters,
     2010, pp. 1113–1121.
[33] A. Ishay, Z. Yang, J. Lee, Leveraging large language models to generate answer set programs, in:
     Proceedings of the 20th International Conference on Principles of Knowledge Representation and
     Reasoning, 2023, pp. 374–383.
[34] Z. Yang, A. Ishay, J. Lee, Coupling large language models with logic programming for robust and
     general reasoning from text, in: Findings of the Association for Computational Linguistics: ACL
     2023, 2023, pp. 5186–5219.
[35] A. Ishay, Z. Yang, J. Lee, Leveraging Large Language Models to Generate Answer Set Programs,
     in: Proceedings of the 20th International Conference on Principles of Knowledge Representation
     and Reasoning, 2023, pp. 374–383. URL: https://doi.org/10.24963/kr.2023/37.
     doi:10.24963/kr.2023/37.
[36] A. Rajasekharan, Y. Zeng, G. Gupta, Argument analysis using answer set programming and
     semantics-guided large language models., in: ICLP Workshops, 2023.
[37] R. Kaminski, J. Romero, T. Schaub, P. Wanko, How to build your own asp-based system?!, Theory
     and Practice of Logic Programming 23 (2023) 299–361. doi:10.1017/S1471068421000508.