<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploiting Large Language Models for the Automated Generation of Constraint Satisfaction Problems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lothar Hotz</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Bähnisch</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Lubos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Felfernig</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Albert Haag</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes Twiefel</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graz University of Technology</institution>
          ,
          <addr-line>Graz</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Hamburger Informatik Technologie-Center e.V.</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Product Management Haag GmbH</institution>
          ,
          <addr-line>Bad Dürkheim</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>exXxa GmbH</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Constraint Satisfaction Problems (CSPs) are a core technology for solving many real-world problems, especially configuration tasks. A key success factor in this context is an efficient knowledge acquisition process in which domain experts and knowledge engineers (developers of CSPs) reach an agreement on the correctness of the expanding knowledge base as early as possible. In this paper, we show how large language models (LLMs) can be applied to the automated generation of solutions for constraint satisfaction problems, thus reducing overheads related to CSP development and maintenance in the future.</p>
      </abstract>
      <kwd-group>
        <kwd>Constraint Satisfaction Problems</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Knowledge Acquisition</kwd>
        <kwd>Automated Generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Knowledge acquisition for knowledge-based systems, especially constraint-based systems, is a complex task. It includes the formalization of partly tangible knowledge with a knowledge model, such as a configuration, constraint, or feature model [<xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>]. Constraint-based systems, in particular, are often used to implement configuration systems, due to their ability to compute possible values for configurations or even configurations themselves directly [<xref ref-type="bibr" rid="ref3">3</xref>]. Hence, modeling a configuration problem as a constraint satisfaction problem (CSP) is a typical approach for computing configurations. This type of modeling is similar to developing programs in a higher-level programming language but also incorporates logical semantic elements.</p>
      <p>Currently, in software engineering, the use of large language models (LLMs) is exploited to support programmers in their daily tasks, such as coding, code completion, reviewing code, API programming, generating test cases, documentation, or identifying design patterns, as well as learning programming languages or understanding legacy code [<xref ref-type="bibr" rid="ref4">4</xref>]. This is achieved through general LLMs such as ChatGPT or specific, coding-trained LLMs such as CoPilot, software agents like CREWAI and AutoGenStudio, or software-engineering assistants such as Devin AI or its open-source counterpart Devika.</p>
      <p>An example of developing a small game with a graphical user interface is GPT-Engineer, which prompts in natural language and provides the game logic and the graphical interface through HTML and CSS coding. Especially the combination of an LLM with a compiler or interpreter in a chain leads to a self-evaluation where code generated by an LLM is directly checked through the interpreter, and the resulting error message is taken as the next prompt for the LLM. Through iterating this (e.g., with a supporting tool such as LangChain), a syntactically correct program is created. However, broader tasks, such as reviewing a software architecture or refactoring code distributed over multiple files, are hindered by the prompt (the context window) being limited to a given number of tokens, such as 128,000 input tokens for GPT-4. In summary, applying LLMs to software development still has to cope with issues such as limited context windows, maintainability of AI-created code, and applicability in project planning or software architecture.</p>
      <p>Inspired by these approaches, the following questions arise for constraint satisfaction problems. A main ingredient of LLMs is of course a prompt that triggers the generation of text, here of knowledge models. Hence, for a knowledge model, the questions are:
• What type of prompts do we need?
• What has to be presented to the LLM?
• Is natural language text describing the problem enough?
• Should examples of the formal modeling language, which shall be used for formalization, be given?
• Or the complete language specification?</p>
      <p>This paper explores the innovative use of LLMs to automatically generate constraint models in constraint programming languages like PyChoco (https://pypi.org/project/pychoco/). Constraint programming involves defining problems in terms of variables, related variable domains, and constraints that must be satisfied. Popular problems in this domain include, e.g., the N-queens problem, Magic Square, Map Coloring, and the Traveling Salesperson Problem (TSP). To tackle such problems, we propose a generalized approach (Section 3) where an LLM is leveraged to generate a constraint model based on a provided prompt.</p>
      <p>We test our approach based on several constraint satisfaction problem examples (Section 4). In this paper, we mainly concentrate on known examples from the constraint community given by well-known descriptions, not that much on new examples formulated in natural language. Although the task of starting with known problems is untypical for configuration problems, in our view, this step is necessary for exploring the possibilities of LLMs for knowledge modeling in general. This approach of using commonly known constraint and configuration problems builds a baseline for further work.</p>
      <p>With our proposed approach, we aim to streamline the process of model creation in constraint programming, making it faster and more accessible by harnessing the capabilities of LLMs. The main idea is to define a general prompt once, which describes the problem context, and then reuse this general prompt for specific knowledge engineering tasks at hand.</p>
    </sec>
    <sec id="sec-2">
      <title>2. State of the Art in LLMs and Knowledge Modeling</title>
      <p>
        Many real-world problems can be interpreted as
constraint satisfaction problems (CSPs). Knowledge
engineers have the task of formalizing domain-specific
constraints into corresponding constraint-based
representations. This formalization step is often effortful,
and more assistance and automation are required in the
modeling process, which can be regarded as a major challenge
for constraint-based systems [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Due to recent
developments in the area of large language models (LLMs), we
could envision agent-based approaches that support the
(semi-)automated generation of CSPs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In the
following, we discuss related work on the application of LLMs
to knowledge modeling.
      </p>
      <sec id="sec-3-2">
        <p>The idea of exploiting synergy potentials of knowledge-based systems and language models for knowledge-base generation has been proposed, among others, by Petroni et al. [7], Ding et al. [8], and Razniewski et al. [9].</p>
        <p>
          Suchanek and Luu [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] motivate the integration of LLMs
with data-driven and knowledge-based approaches to
exploit the advantages of both worlds, specifically, to
let data-driven approaches act as a basis for grounding
the LLM output in reality. In the work presented in our
paper, a simple form of grounding is the usage of test
cases to assure intended CSP semantics. A similar line of
research is presented in Nayak and Timmapathini [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]
where object and relation identification based on LLMs
is investigated.
        </p>
      </sec>
      <sec id="sec-3-3">
        <p>Ahmed and Choudhury [12] introduce the idea of applying LLMs to the generation of optimization problems. The motivation behind this is to make related problem formulation tasks more accessible to domain experts, i.e., to decrease the need for specialized mathematical knowledge to make problem formulations feasible. Based on a given dataset comprising different example problem definitions (and solutions), LLMs are used to generate optimization problem definitions for new problems defined in a textual fashion. A major difference compared to our work is that model fine-tuning is primarily based on a "human in the loop" approach, whereas we focus more on automated fine-tuning, giving the LLM feedback on the syntactical level (is the generated CSP correct, or what were the compilation issues?) and on the semantic level with test cases. A related approach is the application of LLMs to support different types of strategic reasoning tasks, for example, in the context of economic simulations and game theory [<xref ref-type="bibr" rid="ref13">13</xref>].</p>
        <p>A very similar approach is provided by [<xref ref-type="bibr" rid="ref14">14</xref>]. This approach, named ‘Program of Thoughts’, also combines an LLM with an external Python interpreter, however, in [<xref ref-type="bibr" rid="ref14">14</xref>] for computing numerical problems. Furthermore, the verification of the generated Python program is not discussed.</p>
        <p>The adaptation of LLMs for logical reasoning tasks [<xref ref-type="bibr" rid="ref15">15</xref>] can be performed either based on LLM fine-tuning, where LLMs are trained (and adapted) for specific problem settings [<xref ref-type="bibr" rid="ref16">16</xref>], or based on (automated) prompt adaptation [<xref ref-type="bibr" rid="ref17">17</xref>]. Intending to improve the quality of code generation, Pan et al. [<xref ref-type="bibr" rid="ref15">15</xref>] follow the idea of using constraint (and SAT) solver execution errors as feedback channels (in the prompting process) to increase the quality of code generation. The feedback approach presented in our paper extends the work of Pan et al. [<xref ref-type="bibr" rid="ref15">15</xref>] in terms of additionally providing test cases that help to improve the quality of the generated CSP also on the semantic level.</p>
        <p>In [18], various prompts are used to create source code with built-in variability. Specific prompts enable the generation of code in different programming languages, allowing for application configuration. However, a human developer would execute these prompts; no iterative process ensures the correctness of the resulting code.</p>
        <p>The automated generation of CSPs can also be regarded as a specific type of ontology construction where concept hierarchies are derived based on different LLM prompts representing concept hierarchy-specific queries (see, for example, the work of Funk et al. [<xref ref-type="bibr" rid="ref19">19</xref>]). An example of such a query is the following: What are all of the most important subcategories of the category A?</p>
      </sec>
      <sec id="sec-3-5">
        <title>3. Methodology</title>
        <p>Our approach for generating constraint models uses a structured, iterative process designed to create syntactically and semantically (almost) correct models. This begins by defining a general LLM prompt that can encompass the varying specifics of different constraint problems. For instance, the prompt can be tailored to a specific constraint problem but always solicits a PyChoco interpreter that solves the problem.</p>
        <p>The process, furthermore, involves using an iterative loop where the LLM generates a constraint model based on the provided prompt. This generated model is then verified through a constraint interpreter or compiler. If the syntax of the model is incorrect, the result of the Python interpreter call is automatically taken as a prompt for a next iteration, the model is adjusted by the LLM, and the process repeats. This loop continues until the syntax is correct.</p>
        <p>Once a syntactically correct model is produced, the next step is to verify its semantics. This is done by running a series of test cases specific to the problem domain. Successful completion of test cases indicates a correct semantic implementation (at least concerning the test cases).</p>
        <p>This approach, thus, executes an iterative prompting of an LLM, where only a human user provides the first prompt, while the results of a Python interpreter are automatically used by the LLM as successive prompts. A complete example of such an iterative dialog, including in-between generated answers of the LLM (”Assistant”), is presented in the Appendix Section 2.</p>
        <p>This methodology can be further detailed through the following steps:</p>
        <p>Leveraging Existing Knowledge Problems: To create useful constraint models, we use logical problems that are well understood and documented on the internet. Examples include classic problems such as the N-queens problem, Map Coloring, or the Traveling Salesperson Problem. These problems are selected due to their well-known constraints and solution strategies.</p>
        <p>Integration with Constraint Solvers: Once the logical problems are defined, we connect these models to a constraint system, here PyChoco; other suitable solvers could be used. This involves mapping the logical problem’s constraints to the syntax and structures of the chosen constraint language and, thus, facilitating automated solving.</p>
        <p>Syntax Verification Loop: An iterative process is set up where the LLM generates a constraint model based on the initial prompt. This model is then verified for syntactic correctness using the constraint interpreter or compiler. If errors are identified, the prompt is automatically adjusted by the LLM, by using an error message as a next prompt, and the LLM regenerates the model. This loop continues until a syntactically correct model is achieved.</p>
        <p>Semantic Verification with Test Cases: After achieving a syntactically correct model, the focus shifts to ensuring semantic correctness. This involves introducing test cases specific to the defined problem. The LLM-generated model runs through these test cases to verify that it not only adheres to syntactic rules but also correctly implements the problem’s logic. Iteration continues until the model passes all semantic verification tests.</p>
        <p>Generalizability: The ultimate goal of our approach is to establish a generalizable method for generating constraint models using LLMs. By systematically using an iterative loop for both syntactic and semantic verification that employs examples and logical problem definitions, we aim to create a robust framework that can be adapted to various constraint languages and problems.</p>
        <p>By following these steps, our approach leverages the capabilities of LLMs to create accurate and efficient constraint models, providing a scalable solution for different problem types.</p>
        <p>tool Python_REPL, description: A Python shell. Use this to run and verify your solution to the constraint problem. Input should be a valid Python script using pychoco. The last line should be "verify_solutions(solutions, ...)" in order to verify your solutions.
System:
You are an agent designed to solve constraint problems, you write python code to solve and verify constraint problems. You have access to a python REPL, which you can use to execute python code and verify your solution. If you get an error, debug your code and try again. You only use the constraint solver pychoco. You conclude your python code with a variable "solutions" and verify it with the method call "verify_solutions(solutions, &lt;problem_name&gt;)", where &lt;problem_name&gt; is given to you as the first word of the instructions. "solutions" should be a list of all solutions to the constraint problem. If verification fails, you correct your solution.</p>
        <p>Use the following example as a template for solving constraint problems with pychoco:
User: ints10. find numbers one to 10.</p>
        <p>Assistant:
‘‘‘python
from pychoco.model import Model
from test_tool import verify_solutions
model = Model()
x = model.intvar(1, 10, "x")
solver = model.get_solver()
solutions = [s.get_int_val(x) for s in solver.find_all_solutions()]
verify_solutions(solutions, "ints10")
‘‘‘</p>
      </sec>
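        <p>The syntax-verification loop described in this section can be sketched in a few lines of Python. This is only an illustrative skeleton, not the paper's implementation: fake_llm and syntax_verification_loop are invented names, fake_llm stands in for a real LLM call (e.g., through LangChain), and exec() plays the role of the Python REPL tool.

```python
# Minimal sketch of the syntax-verification loop (illustrative only):
# run the candidate model, and on failure feed the interpreter's error
# message back as the next prompt until the code runs.
import traceback

def fake_llm(feedback, attempt):
    # Stand-in for an LLM call. The first attempt contains a bug
    # (undefined name "rang"); after seeing the error, it is fixed.
    if attempt == 0:
        return "solutions = list(rang(1, 11))"
    return "solutions = list(range(1, 11))"

def syntax_verification_loop(llm, max_iterations=5):
    feedback = "solve the ints10 problem"  # initial user prompt
    for attempt in range(max_iterations):
        code = llm(feedback, attempt)
        namespace = {}
        try:
            exec(code, namespace)  # run the candidate model (REPL role)
            return namespace["solutions"], attempt + 1
        except Exception:
            # the interpreter output becomes the next prompt
            feedback = traceback.format_exc()
    raise RuntimeError("no syntactically correct model found")

solutions, iterations = syntax_verification_loop(fake_llm)
```

In the full approach, a second, semantic loop would additionally run verify_solutions-style test cases on the returned solutions.</p>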
    </sec>
    <sec id="sec-4">
      <title>4. Prompting Constraint Models</title>
      <sec id="sec-4-1">
        <p>We divide prompts into a general part which is used with each constraint problem, i.e., the system prompt (see Section 4.1), and a query part which points to a specific problem (see Section 4.2).</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.1. General Prompts</title>
        <p>Prompting Syntactically Valid Models: Figure 1 shows the general prompt which we developed for querying for constraint solutions. The prompt provides the following explanations to the LLM:
1. what role it has (“You are an agent...“),
2. the explanation of the REPL (Read-Eval-Print-Loop) as a tool (“You have access...“),
3. a hint of the behavior of the tool (“The python REPL...“),
4. a hint of how to use the tool (“Use this...“),
5. to focus on that tool (“use only“).</p>
        <p>Prompting Semantic Verification: To verify whether a semantically correct solution can be computed with the generated constraint model, we include a verification scheme that builds on tests. Those are specific to the constraint problem at hand. However, the general prompt is enhanced with a call to the verification method, which is parameterized with the name of the constraint problem.</p>
      </sec>
      <sec id="sec-4-7">
        <p>The verification scheme is given by an example that explains the parameterization, a constraint model, and the call to the verification method (Figure 1).</p>
      </sec>
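        <p>For illustration, a verify_solutions-style harness might look as follows. The paper's actual test_tool implementation is in its repository and is not reproduced here; the checker registry and its entries below are assumptions made only to show the calling convention (problem name as second argument) and the report format.

```python
# Illustrative sketch of a verify_solutions-style test harness (assumed
# structure, not the paper's test_tool): a registry maps each problem
# name to a validity predicate and the expected number of solutions.
CHECKERS = {
    # problem name: (validity predicate, expected number of solutions)
    "ints10": (lambda s: s in range(1, 11), 10),
}

def verify_solutions(solutions, problem_name):
    predicate, expected = CHECKERS[problem_name]
    correct = sum(1 for s in solutions if predicate(s))
    # report format mirrors the tool response shown later in the paper
    report = (str(correct) + " of the " + str(len(solutions)) +
              " solutions are correct. Expected " + str(expected) +
              " solutions")
    print(report)
    return correct == len(solutions) and len(solutions) == expected

ok = verify_solutions(list(range(1, 11)), "ints10")
```

A real checker for, e.g., N-queens would analytically test each queen placement instead of comparing against an enumerated set.</p>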
      <sec id="sec-4-8">
        <p>Prompt Refinement: For more understanding of why the LLM generates a certain code line in the model, we have enforced the LLM to provide comments above each code line (see Figure 2). This is done by explaining in the prompt what a “good python code example“ is (the italic style in the prompt is only for the human reader, i.e., only the text is given as prompt).</p>
      </sec>
      <sec id="sec-4-9">
        <p>This is done by repeating this text at various appropriate positions in the prompt, as well as by providing such an example in the prompt:</p>
        <p>You are an agent designed to solve constraint problems, you write and execute python code to answer questions. You have access to a python REPL, which you can use to execute python code. The python REPL will keep its state between usage. Use this to gradually approach a final solution. Divide the solution into meaningful parts, run them part after part, verifying that each part runs correctly. If you get an error, debug your code and try again only the last part that failed. You only use the constraint solver pychoco. Only use the output of your code to answer the question. Write for each code line a comment as needed to justify your reasoning for that code line. I will give you an example of good python code, please, follow the code convention of the good python code example. This good python code example is only an example. This good python code example has nothing to do with the constraint problem solution. You might know the answer without running any code, but you should still run the code to get the answer. If it does not seem like you can write code to answer the question, just return "I don’t know" as the answer. The user may ask questions or give follow up instructions after you presented your solution. You then have to adjust your solution accordingly.
good python code example:
"# Import necessary modules
import os
# Define the input and output file paths
input_file_path = \’numbers.txt\’
output_file_path = \’average.txt\’
# Function to read numbers from a file
def read_numbers_from_file(file_path):
    # Initialize an empty list to store the numbers
    numbers = []
    # Open the file in read mode
    with open(file_path, \’r\’) as file:
        # Read each line in the file
        for line in file:
            # Strip any leading/trailing whitespace and convert to float
            number = float(line.strip())
            # Append the number to the list
            numbers.append(number)
    # Return the list of numbers
    return numbers
main()"</p>
      </sec>
      <sec id="sec-4-10">
        <title>4.2. Prompting Specific Problems</title>
        <p>The query part simply consists of the name of the constraint problem and related specific problem descriptions, see Listing 1. (The code in the repository contains further problems, such as Magic Square, that lead to similar observations.)</p>
        <p>N-Queens Problem</p>
        <p>Problem: Positioning n queens with possible movements known from chess on an n × n chessboard (see files “chats/queen8_cs_agent_sol2_gpt-4-1106-preview_*“ in the repository for results). Figure 1 shows the used prompt.</p>
        <p>Results: The LLM makes multiple Python REPL calls, one that prints the solution, and one that follows the system prompt on how to verify the solution. This leads to the correct computation of the model, including the conversion of the solutions to a list of lists (see Figure 3).</p>
        <p>Comments: The solution to the N-queens problem is provided in the PyChoco documentation on the internet, hence, the LLM was trained with it. However, a different solution was found. Also, the 3-queens problem, which has no solution, was correctly represented. A further observation was that the result is not deterministic, e.g., it varies the arithmetical constraints for the diagonals (see Listing 2). Fixing the seed and setting the temperature to zero does not make results deterministic.</p>
      </sec>
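        <p>The solution counts discussed here can be reproduced without any constraint solver by a small backtracking enumerator; the sketch below is an independent ground-truth check added for illustration, not the paper's PyChoco model.

```python
# Plain-Python backtracking enumerator for N-queens, used only as an
# independent ground truth for the counts discussed in the text.
# queens[i] holds the column of the queen placed in row i.
def n_queens_solutions(n):
    solutions = []

    def safe(queens, col):
        row = len(queens)
        for r, c in enumerate(queens):
            # conflict: same column, or same diagonal
            # (equal row and column differences)
            if c == col or abs(c - col) == row - r:
                return False
        return True

    def place(queens):
        if len(queens) == n:
            solutions.append(list(queens))
            return
        for col in range(n):
            if safe(queens, col):
                queens.append(col)
                place(queens)
                queens.pop()

    place([])
    return solutions
```

For n = 8 this yields the 92 solutions reported by the verification tool, and for n = 3 it yields none, matching the observation that the 3-queens problem has no solution.</p>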
      <sec id="sec-4-11">
        <p>The semantic verification test for the N-queens problem is shown in Listing 3. Through the method verify_solutions called by the prompt, each solution (queens) is tested. This test is exhaustive because it analytically tests the queens’ positions, not leaving out a constraint.</p>
        <p>Listing 1: Queries for specific constraint problems
queen3 = "solve the 3-queen problem"
queen8 = "solve the 8-queen problem"
queenn = "solve the n-queen problem, use n=8 as test instance"
---------------------------------------------------------------------------
coloring = """
solve the map coloring problem for four regions, three colors and the given adjacency:
regions = [’A’, ’B’, ’C’, ’D’]
adjacency_list = {
    ’A’: [’B’, ’C’],
    ’B’: [’A’, ’C’, ’D’],
    ’C’: [’A’, ’B’, ’D’],
    ’D’: [’B’, ’C’]
}.
the solution should be a list of python dicts where each dict maps regions to color indices
"""
---------------------------------------------------------------------------
tsp = """solve the traveling salesman problem, use the following problem instance:
# Number of cities
C = 4
# Distance matrix
D = [[0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0]]
the solution should be a list of valid solutions, each solution being a list of integers
representing the cities to be visited.
"""</p>
        <p>Map Coloring</p>
        <p>Problem: Listing 1 describes the well-known Map Coloring as a query.</p>
        <p>Results: A first model which contains the constraint all_different raised an error (this seems to be an implementation issue, see https://github.com/chocoteam/choco-solver/issues/705) but was corrected by using another constraint modeling (i.e., with != instead of all_different) through 2 iterations (coloring_cs_agent_sol2_gpt-4-1106-preview_20240529094847.txt). A further run provided a one-shot success, which was, however, not recognized as such; instead, a solution was hallucinated for presenting to the user (coloring_cs_agent_sol2_gpt-4-1106-preview_20240529095157.txt). A further run firstly used the API wrongly, but finally found a correct solution (coloring_cs_agent_sol2_gpt-4-1106-preview_20240529095612.txt).</p>
        <p>Comments: The solution can also be found on the internet; however, this does not prevent the LLM from making errors. Finally, also because the iterative approach is used, a solution was found.</p>
        <p>Traveling Salesperson Problem</p>
        <p>Problem: Listing 1 describes the well-known TSP as a query.</p>
        <p>Results: In general, there are several recurring issues when attempting to model constraints for the TSP (see the error logs in Appendix Section 8). One major problem is accurately modeling distance constraints, which often leads to incorrect solutions. Similarly, the Hamiltonian path constraints are difficult to model correctly, compounding the problem. Additionally, there is frequent misuse of the API. These misuses typically involve providing arguments in the wrong order or supplying an incorrect number of arguments. As a result of these issues, the solutions generated are often flawed. With GPT-4, no solutions could be found with the resources at hand, but with GPT-4o, some solutions were correctly identified. However, no implementation was found that computes all the test cases provided for verification. Overall, these challenges underscore the complexity of effectively modeling TSP constraints and highlight areas that require significant improvements to find optimal solutions consistently.</p>
      </sec>
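      <p>Both Listing 1 instances are small enough that ground-truth answers can be brute-forced in plain Python. The sketch below is a sanity check added for illustration (it is not part of the paper's pipeline): it enumerates all proper 3-colorings of the given map and all round trips of the given 4-city TSP instance.

```python
# Brute-force ground truth for the two Listing 1 instances, independent
# of any constraint solver (illustrative sanity check only).
from itertools import permutations, product

# Map coloring: try every assignment of 3 colors to the 4 regions and
# keep those where all adjacent regions get different colors.
regions = ["A", "B", "C", "D"]
adjacency = {"A": ["B", "C"], "B": ["A", "C", "D"],
             "C": ["A", "B", "D"], "D": ["B", "C"]}
colorings = []
for colors in product(range(3), repeat=len(regions)):
    assignment = dict(zip(regions, colors))
    if all(assignment[r] != assignment[m]
           for r in regions for m in adjacency[r]):
        colorings.append(assignment)

# TSP: enumerate every round trip starting at city 0 over the given
# distance matrix and take the cheapest tour cost.
D = [[0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0]]
def tour_cost(tour):
    legs = zip(tour, tour[1:] + tour[:1])  # close the cycle
    return sum(D[a][b] for a, b in legs)
best_cost = min(tour_cost([0] + list(p)) for p in permutations([1, 2, 3]))
```

Such exhaustive baselines are exactly what the verification test cases compare the LLM-generated PyChoco solutions against.</p>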
      <sec id="sec-4-12">
        <title>4.3. Implementation</title>
        <p>The implementation, which utilizes LangChain and Python, leverages ChatGPT-4 and ChatGPT-4o and is available in a repository at Zenodo (https://doi.org/10.5281/zenodo.12794211). As a constraint solver, we used PyChoco as a Python API for the constraint solver Choco. The resources of ChatGPT are limited with respect to context/prompt length. This causes the LLM to stop responding when the chat size exceeds the allowed number of tokens, which can easily occur because the chat includes the entire previous chat history. The resulting chats are included in the paper’s appendix and the repository. The implementation can be used to rerun the experiments, and the data can be utilized for further investigation in prompt engineering.</p>
        <p>Assistant:
There was a mistake in the code. It seems I incorrectly used the ‘solver.find_solution()‘ method, which only finds one solution, and I didn’t convert the solution to the expected list format before passing it to the verification function. I need to find all solutions and convert each solution to a list of integers representing the positions of the queens.</p>
        <p>Let me correct the code.</p>
        <p>Python_REPL (tool call 1/1):
from pychoco.model import Model
from test_tool import verify_solutions
model = Model("8-Queens")
# Create an array of 8 variables representing the positions of queens on a
# chess board
queens = model.intvars(8, 1, 8)
# Add constraints
for i in range(8):
    for j in range(i + 1, 8):
        # Queens cannot be on the same row
        model.arithm(queens[i], "!=", queens[j]).post()
        # Queens cannot be on the same diagonal
        model.arithm(queens[i], "-", queens[j], "!=", i - j).post()
        model.arithm(queens[i], "-", queens[j], "!=", j - i).post()
# Find all solutions
solver = model.get_solver()
solutions = []
for solution in solver.find_all_solutions():
    solutions.append([solution.get_int_val(queens[i]) for i in range(8)])
verify_solutions(solutions, "queen8")
Python_REPL (tool response):
92 of the 92 solutions are correct. Expected 92 solutions
Assistant:
The 8-queens problem has been successfully solved, and all 92 solutions are correct.
tool calls: 6, text messages: 6</p>
      </sec>
      <sec id="sec-4-13">
        <title>5. Discussion</title>
        <p>Overall, across all problems, after initiating multiple chats, each with several iterations, a model that can compute a single solution could be found (see Appendix Sections 3 to 6). In Appendix Section 9, we document the number of iterations per experiment, i.e., calls to the Python interpreter (”Tool Calls”) and answers given by the LLM (”Text Messages”). Nevertheless, for the TSP, we could not find a model that computes all solutions. The iteration was stopped by the LLM if no changes occurred or if too many tokens were used. So the paper’s result is that, in principle, constraint models for well-known problems can be computed, however, not in all cases.</p>
        <p>An interesting observation is that the generated variable names within the model are always semantically meaningful (e.g., "queens") and not arbitrary. This likely occurs because LLMs statistically favor names that have been previously encountered.</p>
      </sec>
      <sec id="sec-4-14">
        <title>In this paper, we use well-known constraint problems,</title>
        <p>Listing 2: Non-deterministic results
- correct
model.arithm(queens[i], "!=", queens[j] + (j - i)).post()
model.arithm(queens[i], "!=", queens[j] - (j - i)).post()
- correct
model.arithm(qs[i], "-", qs[j], "!=", j - i).post()
model.arithm(qs[j], "-", qs[i], "!=", j - i).post()
- wrong
model.abs(queens - queens).ne(i - j).post() # Diagonal constraint
model.abs(queens - queens).ne(j - i).post() # Diagonal constraint
- wrong
model.arithm(queens[i], "-", queens[j], "!=", i - j).post()
model.arithm(queens[i], "+", queens[j], "!=", i + j + 2).post()
- does not find all solutions
diag1 = [model.int_offset_view(queens[i], i) for i in range(n)]
diag2 = [model.int_offset_view(queens[i], -i) for i in range(n)]
model.all_different(diag1).post()
model.all_different(diag2).post()
- wrong
model.arithm(queens[i], "!=", queens[j]).post() # Different columns
model.arithm(queens[i], "-", i, "!=", queens[j], "-", j).post() # Different diagonals (left-top to right-bottom)
model.arithm(queens[i], "+", i, "!=", queens[j], "+", j).post() # Different diagonals (left-bottom to right-top)
- wrong
model.all_different([queens[i], queens[j]]).post()
model.arithm(queens[i], "-", queens[j], "!=", j - i).post()
model.arithm(queens[i], "+", queens[j], "!=", j - i).post()
Listing 3: Verifying solutions for N-queens by testing the queens' positions.</p>
        <p>def is_valid_solution(queens):
    n = len(queens)
    assert type(queens) == list, "expected list of ints"
    for i in range(n):
        for j in range(i + 1, n):
            # Check if queens are in the same row
            if queens[i] == queens[j]:
                return False
            # Check diagonals: if the difference between the column indices equals
            # the difference between the row indices, they're in the same diagonal
            if abs(queens[i] - queens[j]) == abs(i - j):
                return False
    return True</p>
        <p>
which, of course, can only be a first step for leveraging LLMs in knowledge acquisition. The next steps would be to supply new, unknown problems, also industrial ones, e.g., by giving tables of correct variable combinations (configurations) in table constraints and generating abstracted constraint models. However, Listing 1 shows a potential way of formally representing a problem as a query, which can probably be used for other tasks.</p>
        <p>
          There are several general problems identified: the program persistently uses the PyChoco API incorrectly and often hallucinates PyChoco functions. This could not be prevented by providing the API definition in the prompt, because the complete definition exceeds the number of tokens we could send to the LLM. Additionally, the chats are not stable and can vary significantly even with the same prompt. This instability persists even with temperature set to 0 and a fixed seed, making chats non-deterministic. The used prompts should be further developed to achieve the intended general generation of knowledge models; in other words, prompt engineering for knowledge modeling needs to be elaborated. Few-Shot Prompting [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] could be used to encourage the model to produce more consistent outputs and adhere more closely to given instructions.
        </p>
        <p>The approach of first creating syntactically correct models and then verifying them to ensure semantic accuracy could be successfully demonstrated (see Listing 3). Because we use test-driven semantic verification, this verification depends on the quality of the used test cases. For example, the N-queens verification was exhaustive because it was analytical rather than an enumeration of test cases. Furthermore, writing tests for knowledge-based tasks is, in our experience, typically easier than writing knowledge models; hence, the experts' tasks of creating formal constraint and configuration models will, in our opinion, shift from writing models to writing tests - and letting an LLM write the models. Thus, the knowledge model development process can, with the presented approach, probably be executed by domain experts, not only by knowledge engineers.</p>
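<p>That verification quality hinges on the tests can be illustrated by cross-checking the analytical test against one of the "correct" diagonal formulations of Listing 2 via brute-force enumeration; a stdlib-only sketch (0-based rows and one queen per column are our assumptions):</p>

```python
from itertools import permutations

def ok_analytical(b):
    # Listing 3 style: same row or same diagonal between any pair is forbidden
    n = len(b)
    return all(b[i] != b[j] and abs(b[i] - b[j]) != j - i
               for i in range(n) for j in range(i + 1, n))

def ok_pairwise(b):
    # First "correct" variant of Listing 2: queens[i] != queens[j] +/- (j - i)
    n = len(b)
    return all(b[i] != b[j] + (j - i) and b[i] != b[j] - (j - i)
               for i in range(n) for j in range(i + 1, n))

# One queen per column with pairwise-distinct rows: permutations of range(8)
boards = [list(p) for p in permutations(range(8))]
analytical = [b for b in boards if ok_analytical(b)]
pairwise = [b for b in boards if ok_pairwise(b)]
print(len(analytical), analytical == pairwise)  # 92 True
```

<p>Both formulations accept exactly the 92 known 8-queens solutions, whereas the "wrong" variants of Listing 2 would diverge from this reference under the same enumeration.</p>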
      </sec>
      <sec id="sec-4-15">
        <p>The paper offers, furthermore, an easy way to begin with iterative LLM prompting. Moreover, it is suggested to use the LLM for creating a new configuration language by exploiting its hallucination capabilities.</p>
      </sec>
    </sec>
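<p>The iterative generate-and-verify loop described in this paper (generate a model, execute it, feed errors or failed tests back to the LLM) can be sketched as follows; ask_llm is a stub standing in for the actual ChatGPT tool-calling setup, not the paper's implementation:</p>

```python
# Hedged sketch of the iterative generate-and-verify loop; ask_llm is a stub
# standing in for the real LLM backend used in the paper.
def ask_llm(prompt):
    # Stub: a real implementation would call an LLM API and return generated
    # solver code; we return a fixed candidate answer for illustration.
    return "solutions = [[0, 4, 7, 5, 2, 6, 1, 3]]"

def is_valid(board):
    # Analytical N-queens test as in Listing 3
    n = len(board)
    return all(board[i] != board[j] and abs(board[i] - board[j]) != j - i
               for i in range(n) for j in range(i + 1, n))

def run_and_verify(code):
    env = {}
    try:
        exec(code, env)  # syntactic check: does the generated code run?
    except Exception as e:
        return False, f"error: {e}"
    sols = env.get("solutions", [])
    ok = bool(sols) and all(is_valid(s) for s in sols)  # semantic check via tests
    return ok, "all solutions valid" if ok else "invalid or missing solutions"

def iterate(prompt, max_rounds=5):
    feedback = ""
    for _ in range(max_rounds):
        code = ask_llm(prompt + feedback)
        ok, msg = run_and_verify(code)
        if ok:
            return code
        feedback = f"\nPrevious attempt failed: {msg}. Please fix it."
    return None
```

<p>In the real setup the loop additionally stops when no changes occur between iterations or when the token budget is exhausted, as discussed above.</p>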
    <sec id="sec-5">
      <title>6. Conclusions</title>
      <sec id="sec-5-1">
        <p>The paper presents the first steps in supporting the formalization task in constraint modeling with LLMs, i.e., the mapping of a well-known problem into a constraint representation of a certain constraint language, here PyChoco. By taking various constraint problems, formulating generally applicable prompts, and doing so in an iterative manner which includes syntactic and semantic verification processes, the LLM could generate appropriate and correct constraint models. Through the iteration, in-between errors related to syntax and semantics were automatically corrected.</p>
        <p>Further work will include a quantitative evaluation of the method, a comparison of different LLMs, as well as the generation of constraint models for unknown problems. In total, this paper provides a first step into a new kind of constraint and configuration modeling approach with LLMs.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <sec id="sec-6-1">
        <p>This project was funded through the Federal Institute for Research on Building, Urban Affairs and Spatial Development (Bundesinstitut für Bau-, Stadt- und Raumforschung), Germany, within the Zukunft Bau Forschungsförderung Maßnahme (Future of Construction research funding measure): "AI-supported modernization of rental housing stock" ("KI-gestützte Modernisierung an Mietwohnungsbaubeständen (KIMM)"), as well as through the Federal Ministry for Economic Affairs and Climate Action (Bundesministerium für Wirtschaft und Klimaschutz), Germany, within the "Joint project: EnOB: IntelMOD - Intelligent modernization platform based on functional cost splitting; sub-project: Infrastructure of the modernization platform." ("Verbundvorhaben: EnOB: IntelMOD - Intelligente Modernisierungsplattform auf Basis des Funktionalen Kostensplittings; Teilprojekt: Infrastruktur der Modernisierungsplattform.")</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hotz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bagley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tiihonen</surname>
          </string-name>
          ,
          <article-title>Knowledge-based Configuration -</article-title>
          From Research to Business Cases, Morgan Kaufmann,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Falkner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Benavides</surname>
          </string-name>
          ,
          <source>Feature Models: AI-driven Design, Analysis, and Applications</source>
          , Springer,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Hotz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stumptner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ryabokon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bagley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <source>Configuration Knowledge Representation &amp; Reasoning</source>
          , in: A.
          <string-name>
            <surname>Felfernig</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Hotz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Bagley</surname>
          </string-name>
          , J. Tiihonen (Eds.),
          <article-title>Knowledge-based Configuration -</article-title>
          From Research to Business Cases, Morgan Kaufmann Publishers,
          <year>2014</year>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>I.</given-names>
            <surname>Ozkaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Carleton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Robert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <article-title>Application of large language models (llms) in software engineering: Overblown hype or disruptive change?</article-title>
          , Carnegie Mellon University, Software Engineering Institute's
          <source>Insights (blog)</source>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.58012/6n1p-pw64, accessed: 2024-May-16.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B. O</given-names>
            <surname>'Sullivan</surname>
          </string-name>
          ,
          <article-title>Automated modelling and solving in constraint programming</article-title>
          ,
          <source>in: 24th AAAI Conference on Artificial Intelligence</source>
          , AAAI'10, AAAI Press,
          <year>2010</year>
          , pp.
          <fpage>1493</fpage>
          -
          <lpage>1497</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Freuder</surname>
          </string-name>
          ,
          <article-title>Conversational modeling for constraint satisfaction</article-title>
          ,
          <source>AAAI Conference on Artificial Intelligence</source>
          <volume>38</volume>
          (
          <year>2024</year>
          )
          <fpage>22592</fpage>
          -
          <lpage>22597</lpage>
          . URL: https://ojs.aaai.org/index.php/AAAI/article/view/30268. doi:10.1609/aaai.v38i20.30268.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bakhtin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <article-title>Language models as knowledge bases?</article-title>
          ,
          <year>2019</year>
          . URL: https://arxiv.org/pdf/1909.01066.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poudel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Weninger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Veeramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          , Entgpt:
          <article-title>Linking generative large language models with knowledge bases, 2024</article-title>
          . URL: https://arxiv.org/abs/2402.06738.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Razniewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kassner</surname>
          </string-name>
          , G. Weikum,
          <article-title>Language models as or for knowledge bases</article-title>
          ,
          <year>2021</year>
          . URL: https://arxiv.org/abs/2110.04888.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Suchanek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Luu</surname>
          </string-name>
          ,
          <article-title>Knowledge bases and language models: Complementing forces</article-title>
          ,
          <source>in: 7th International Joint Conference on Rules and Reasoning</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2023</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>15</lpage>
          . URL: https://doi.org/10.1007/978-3-031-45072-3_1.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nayak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Timmapathini</surname>
          </string-name>
          , Llm2kb:
          <article-title>Constructing knowledge bases using instruction tuned context aware large language models</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2308.13207.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          , S. Choudhury,
          <article-title>Lm4opt: Unveiling the potential of large language models in formulating mathematical optimization problems</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2403.01342.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Mao,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          , A. de Wynter,
          <article-title>LLM as a mastermind: A survey of strategic reasoning with large language models</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2404.01230v1. arXiv:2404.01230.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <article-title>Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks</article-title>
          ,
          <source>Transactions on Machine Learning Research</source>
          (
          <year>2023</year>
          ). URL: https://openreview.net/forum?id=YfZ4ZPt8zd.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Albalak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          , Logiclm:
          <article-title>Empowering large language models with symbolic solvers for faithful logical reasoning</article-title>
          ,
          <year>2023</year>
          . arXiv:2305.12295.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Tafjord</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Richardson</surname>
          </string-name>
          ,
          <article-title>Transformers as soft reasoners over language</article-title>
          ,
          <source>in: 29th International Joint Conference on Artificial Intelligence, IJCAI'20</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>3882</fpage>
          -
          <lpage>3890</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuurmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. H.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Self-consistency improves chain of thought reasoning in language models</article-title>
          ,
          <source>ArXiv</source>
          (
          <year>2022</year>
          ). URL: https://arxiv.org/abs/2203.11171.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Acher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Duarte</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.-M. Jézéquel</surname>
          </string-name>
          ,
          <article-title>On programming variability with large language model-based assistant</article-title>
          ,
          <source>in: Proceedings of the 27th ACM International Systems and Software Product Line Conference - Volume A, SPLC '23</source>
          ,
          Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , p.
          <fpage>8</fpage>
          -
          <lpage>14</lpage>
          . URL: https://doi.org/10.1145/3579027.3608972.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Funk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hosemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <article-title>Towards ontology construction with language models</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2309.09898.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          , et al.,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>1877</fpage>
          -
          <lpage>1901</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>