<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards LLM-based Configuration and Generation of Books</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jovan Mihajlovic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Felfernig</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graz University of Technology</institution>
          ,
          <addr-line>Infeldgasse 16b, Graz, 8010</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
<p>Large Language Models (LLMs) can support a wide range of content generation tasks. Interaction with LLMs can occur either through user-friendly web interfaces or via provided APIs. Our work focuses on content generation for a specific use case: creating lecture books from recorded lectures. To support this goal, a web application with a simple configuration interface has been developed. Users can include transcripts of recordings, configure options, and generate books through the application. It allows for flexibility by offering different prompts and the ability to select among various LLMs. Initial results demonstrate that LLMs can generate books from the recorded lectures; however, evaluation results show varying output quality depending on the selected configuration.</p>
      </abstract>
      <kwd-group>
        <kwd>Large Language Models</kwd>
        <kwd>Configuration</kwd>
        <kwd>Generation of Books</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Configuration can be regarded as a specialized form of design activity in which the final product is
assembled from a predefined set of component types, all of which must comply with a corresponding
set of domain-specific constraints [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3, 4</xref>
        ]. In this paper, we demonstrate how Large Language Models
(LLMs) can be leveraged for the automated generation of university lecture books. This generation is
based on a collection of video recordings (of lecture units) [5]. Our application enables the configuration
of key properties relevant to book generation and automatically produces a draft book proposal from
the transcripts of these lectures. The major motivation for our work is to explore simple ways of providing
students with additional learning content that helps to improve the overall learning experience and to
accelerate learning. The focus of our work is to apply Large Language Models (LLMs) [6] to generate
book contents based on LLM prompts which are themselves generated on the basis of configured book
and generation process properties.
      </p>
      <p>There are multiple ways to interact with LLMs. First, queries can be defined in the user interfaces of
LLM providers. Second, models can be accessed via APIs offered by these providers, allowing the
development of customized applications tailored to specific tasks such as generating book content from
lecture transcripts. Our book generator application serves as an intermediary between the LLM and the
user: it facilitates the process of transforming lecture transcripts into book content. It simplifies the
interaction with LLMs by abstracting the underlying complexity of prompt engineering. Generating
structured book content involves first establishing context and defining the generation goal, then
supplying the transcript text along with relevant options and constraints. While crafting a prompt
manually might be effective for single-use scenarios, repeating this process for multiple transcripts is
labor-intensive. Our system addresses this by using pre-defined prompt templates.</p>
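      <p>The template-based prompting described above can be sketched as follows. This is a minimal illustration only: the <monospace>buildPrompt</monospace> function, the <monospace>{{transcript}}</monospace> placeholder syntax, and the field names are assumptions, not the application's actual API.</p>
      <preformat>
// Minimal sketch of template-based prompt construction (names and
// placeholder syntax are illustrative assumptions).
interface PromptTemplate {
  body: string; // contains a "{{transcript}}" placeholder
}

// Fill the template with a transcript so users never write prompts by hand.
function buildPrompt(template: PromptTemplate, transcript: string): string {
  return template.body.replace("{{transcript}}", transcript);
}

// A heavily simplified version of the T2SC prompt (see Section 2.1).
const t2scTemplate: PromptTemplate = {
  body: "Generate a LaTeX chapter for this lecture. The transcript is: {{transcript}}",
};

const prompt = buildPrompt(t2scTemplate, "Today we discuss release planning.");
      </preformat>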
      <p>Due to their generative nature, large language models can be applied to various generation-related
tasks. Related examples in the configuration context are the generation of explanations [7] (e.g.,
sustainability-aware explanations nudging configurator users towards more sustainable consumption
patterns), the generation of configuration knowledge bases [8] (which helps to reduce efforts in the
context of configuration knowledge base development and maintenance), and interactive configuration
[9], where LLMs can be applied to support interactive chat-based interfaces that support product and
service configuration by allowing users to describe their requirements/preferences in natural language
(and to generate a set of solver-understandable preferences thereof). In contrast to existing work in the
context of combining LLMs with configuration technologies, the work presented in this paper focuses
on the generation of artifacts (in our case, books) on the basis of pre-configured parameters representing
intended book properties as well as technical properties relevant for the book generation phase.</p>
      <p>The major contribution of this paper is to present an initial idea of a book generation interface based
on configured parameter settings. Furthermore, we summarize initial insights from the analysis of a
first prototype implementation which is currently not available for productive use.</p>
      <p>The remainder of this paper is organized as follows. Section 2 explains how the implemented web
application automates the book generation workflow. Section 3 discusses the outcomes obtained using
real-world lecture recordings. Finally, Sections 4 and 5 outline directions for future development and
summarize our findings.</p>
    </sec>
    <sec id="sec-2">
      <title>2. LLM-based Book Generator</title>
      <p>The book generator application is structured into a backend, implemented using the NestJS framework¹,
and a frontend which uses the React² library. It generates LLM prompts from input data (e.g., transcripts
from lecture videos) and selected (configured) options. The generated prompts are forwarded to the
LLM via API. The input data consists of the following elements:
• one or more lecture video transcript files (textual)
• (optional) a LaTeX template file as basis for guiding the LLM-based book generation (in LaTeX format)
• (optional) parameters (which prompt to use, book title, chapter size, and other parameters)</p>
      <p>The user receives the result either as a single LaTeX file or as a ZIP file containing multiple output files. There
are three "pipelines" available for the user to choose from; these are described in the following. Each pipeline
uses one or more prompts to generate a book from the input data (see Table 1).</p>
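      <p>For illustration, the input data listed above could be modeled as a single configuration object such as the following sketch; all field names are illustrative assumptions, not the application's actual schema.</p>
      <preformat>
// Hypothetical model of the generator's input data, mirroring the list above
// (all names are illustrative assumptions, not the application's actual schema).
type Pipeline = "T2SC" | "C2C" | "C2CO";

interface BookGenerationConfig {
  transcripts: string[];                      // one or more lecture video transcripts (textual)
  latexTemplate?: string;                     // optional LaTeX template guiding generation
  pipeline: Pipeline;                         // which of the three pipelines to run
  bookTitle?: string;                         // optional parameters
  chapterSize?: "short" | "medium" | "long";
}

// At least one transcript is required; everything else is optional.
function isValidConfig(cfg: BookGenerationConfig): boolean {
  return cfg.transcripts.length > 0;
}
      </preformat>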
      <sec id="sec-2-1">
        <title>2.1. Transcript to Single Chapter (T2SC)</title>
        <p>This pipeline focuses on generating a single LaTeX book chapter for each provided transcript file, which
is derived from the video of an individual lecture unit. The LLM is instructed to create one cohesive
chapter. Structuring elements to be used (e.g., sections and subsections) are enumerated explicitly. To
ensure consistency, each chapter is generated based on the same prompt template.
¹ https://nestjs.com, accessed 16-June-2025
² https://react.dev, accessed 16-June-2025</p>
        <p>The primary focus of each chapter is to explain the core concepts of the corresponding lecture unit
in detail. Examples can be included if they help in clarifying the discussed concepts. The style in which
examples are presented is not strictly specified. At the end of each chapter, self-evaluation questions
are included to encourage further learning. In the current version of the book generator, the LLM is
explicitly instructed to generate all book contents in English. To ensure consistency in formatting, each
chapter is generated on the basis of a basic LaTeX template.</p>
        <p>The following is the full prompt (transcript contents are formatted/represented as "[...]"):</p>
        <p>An audio transcript of a lecture is provided within the quotes at the end of this message.
Generate a LaTeX chapter for this lecture. The chapter should start with \chapter (you
can use sections, subsections and everything else provided in standard LaTeX). Focus on
explaining the main concepts of the lecture. Use examples if they can help in explaining.
Add questions for self preparation at the end of the chapter. Include some concepts which
are related, but not mentioned in the transcript. The output language should be English.</p>
        <p>The transcript is: [...]</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Concepts to Chapters (C2C)</title>
        <p>Compared to T2SC, the C2C pipeline includes an additional step for generating chapters. Rather than
creating individual chapters directly from each transcript derived from a lecture unit video, the LLM is
first asked to identify the key concepts within the transcripts. Once the key concepts are identified,
the LLM is instructed to enhance the presentation of the content using lists, text styling, structural
elements, and similar formatting techniques. Each concept should be clearly explained to the reader,
with related concepts further elaborated upon in corresponding subsections. As in T2SC, the LLM is
also directed to include relevant examples to aid in the explanation.</p>
        <p>The following is the prompt, where CONCEPT_NAME corresponds to the target concept the prompt
is applied to:</p>
        <p>Within the context of a Computer Science lecture called "Software development
processes", a concept called "CONCEPT_NAME" is present. Write a LaTeX chapter about that
concept. The output language should be English.</p>
        <p>[list of detailed instructions]</p>
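        <p>The two-step process described above can be sketched as follows. The prompt wording is simplified, and <monospace>llmCall</monospace> stands in for the actual provider API, i.e., it is an assumption for illustration purposes.</p>
        <preformat>
// Sketch of the two-step C2C pipeline: (1) extract key concepts,
// (2) generate one chapter per concept. llmCall is a placeholder,
// not a real provider API.
type LlmCall = (prompt: string) => string;

function runC2C(transcripts: string[], llmCall: LlmCall): string[] {
  // Step 1: identify the key concepts across all transcripts.
  const conceptPrompt =
    "List the key concepts, one per line, in the following transcripts:\n" +
    transcripts.join("\n");
  const concepts = llmCall(conceptPrompt)
    .split("\n")
    .filter((line) => line.trim() !== "");

  // Step 2: generate one LaTeX chapter per identified concept.
  return concepts.map((concept) =>
    llmCall(`Write a LaTeX chapter about the concept "${concept}". The output language should be English.`)
  );
}
        </preformat>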
      </sec>
      <sec id="sec-2-3">
        <title>2.3. C2C with Options (C2CO)</title>
        <p>The main difference between C2CO and the previous pipelines is the provision of options (parameters)
that help to further configure the pipeline – see a corresponding example user interface in Figure 1.
These options can be used to further tailor the generated contents.</p>
        <p>The idea of the C2CO pipeline is the same as C2C; however, the used prompts are different (as a
direct consequence of the provided options). Probably the most influential option is allowing the use
of a customized LaTeX template (see Section 3). Selected options are represented as a list of rules that forms part of
the prompt. For example, the first rule to follow when generating a chapter is the following:
"the chapter starts with introductory paragraph which explains the concept concisely"
Irrelevant elements in provided custom templates (e.g., texts and bibliographic entries from other
articles written on the basis of this template) are first removed by the LLM. The purpose of this step is
to ease further use of the template by the LLM (e.g., texts from other papers risk confusing
the LLM and leading it to integrate related contents into the generated book). The remaining steps are
quite similar to the C2C pipeline – the major difference is the inserted rules which are defined by the
user via the book generator user interface.</p>
        <p>The used prompt is the following:</p>
        <p>Your task is to generate a LaTeX chapter for the following concept (within the context of
a Computer Science lecture called "Software development processes"): "CONCEPT_NAME"
You should follow these rules: [a list of rules like the one stated previously]</p>
        <p>The returned content should be a LaTeX \chapter which can be included in a template.</p>
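        <p>The mapping from configured options to the rule list embedded in the prompt can be sketched as follows. The option names are assumptions made for illustration; the first rule is quoted from the example above.</p>
        <preformat>
// Sketch: mapping configured options to the rule list inserted into the
// C2CO prompt. Option names are illustrative assumptions.
interface ChapterOptions {
  introParagraph: boolean;
  includeExamples: boolean;
  selfEvaluationQuestions: boolean;
}

function buildRules(opts: ChapterOptions): string[] {
  const rules: string[] = [];
  if (opts.introParagraph)
    rules.push("the chapter starts with introductory paragraph which explains the concept concisely");
  if (opts.includeExamples)
    rules.push("use examples if they can help in explaining");
  if (opts.selfEvaluationQuestions)
    rules.push("add questions for self preparation at the end of the chapter");
  return rules;
}

function buildC2COPrompt(concept: string, opts: ChapterOptions): string {
  return (
    `Your task is to generate a LaTeX chapter for the following concept: "${concept}"\n` +
    `You should follow these rules:\n- ${buildRules(opts).join("\n- ")}`
  );
}
        </preformat>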
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Preliminary Evaluation</title>
      <p>The following subsections discuss first results of applying our LLM-based approach to video transcripts
generated from a lecture on the topic of software development processes. In this context, each individual
subsection discusses the observed results of applying an individual pipeline. In its current version,
our book generator application lacks automated quality assurance mechanisms. The generation of the
transcripts was performed on the basis of the OpenAI Whisper model.³</p>
      <p>Our initial evaluation of the LLM-generated outputs (books) has been performed manually and the
corresponding results (feedback of two persons) are presented in an aggregated fashion in the following
paragraphs. For evaluation purposes, the generation focused on an individual lecture unit related to
different techniques in the context of the topic of software requirements prioritization, including subtopics
such as release planning and minimum viable products.</p>
      <p>An overview of the different applied LLMs is provided in Table 2.
³ https://huggingface.co/openai/whisper-large-v3</p>
      <p>In the following, we summarize the initial evaluation results of the LLM-generated outputs on the
basis of the following four evaluation dimensions:
• Understandability of the content
• Completeness: degree to which transcript contents are covered
• Example quality: quality of the included (generated) examples
• Additional content: presence of additional information related to transcript contents, but not
contained in the transcript</p>
      <sec id="sec-3-1">
        <title>3.1. T2SC Pipeline</title>
        <p>gemini-2.0-flash: The content is clear and easy to understand, and it is provided in the requested
language. The coverage of the transcript content is comprehensive, including a thorough introduction
and an in-depth classification of requirement prioritization approaches (this is the lecture topic). The
examples provided are generally of high quality, with a few that could benefit from improvements in
readability and formatting. Compared to Llama, the quality and quantity of examples are significantly
better. The subsection on additional content lists seven related concepts, each accompanied by a concise
description. However, there is a need for better structure, as the concepts are currently presented in a
single paragraph without clear separation.</p>
        <p>llama-3.3-70b-versatile: The content is generated in German, despite the instruction specifying
English. Additionally, some sections are a bit unclear; specifically, the subsections on "Basic Release
Planning" and "Integrated Release Planning" both contain identical sentences. Only the latter mentions
additional factors, which suggests there is some intended difference between the two but fails to clarify
it adequately. The overall completeness is in the lower-medium range. Key concepts like "Minimum
Viable Product," which are included by the other model, are missing. The chapter is concise, fitting
onto just two pages, including the self-preparation questions, but lacks depth in certain areas. The
examples provided are of low quality, with only one example given in two short sentences, which fails
to adequately illustrate the concept. This model performed poorly in terms of additional content, as it
did not provide any supplementary material in this area.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. C2C Pipeline</title>
        <p>gemini-2.0-flash: The C2C pipeline generates multiple chapters along with corresponding sections
and subsections, for which a table of contents can be generated. The content is generally clear and
well-presented. There are some formatting issues with the questions that could hinder readability, but
the questions themselves are understandable. The transcript is largely covered, but some concepts,
such as basic/integrated release planning (despite the presence of a "Release Planning" chapter) and
utility-based prioritization, are notably missing. The examples, primarily presented in table form, are
easy to understand. A snippet of a subsection with an example is shown in Figure 2. However, many of
the other example tables overflow the page, which can make them difficult to read. The additional content
is excellent, with well-organized per-chapter sections and separate chapters that enrich the material.</p>
        <p>llama-3.3-70b-versatile: The content is clear and easy to understand, with no notable issues. The
transcript is well-covered, though the "Release Planning" chapter lacks references to basic/integrated
release planning, similar to the Gemini model. On the positive side, it does include a chapter on
utility-based prioritization, which the other model does not. The quality of examples is mixed: the
textual example in Chapter 2 is strong, while the one in Chapter 3 is quite brief and does not fully explore
the concept. The remaining examples are decent, though it is difficult to assess them fully due to tables
spilling out of the page, making the examples incomplete or hard to view. The additional content has a
satisfactory level of depth and organization.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. C2CO Pipeline</title>
        <p>gemini-2.0-flash: For the template chosen for the evaluation, the title page clearly presents the title
of the book in bold text, with the authors directly below. The header on the page following the title page
suffers from broken formatting, though it remains legible with some effort. Otherwise, the content is
generally easy to understand, though there are a few areas where improved formatting could enhance
clarity. The completeness is similar to the C2C case, as the approach is quite comparable, with only
minor differences in the prompts used for generating the concept chapters. There are numerous examples
presented in table format, all of good quality. Some tables extend beyond the page boundaries, so
significant parts of examples are lost. Each chapter references related concepts as additional content,
but the formatting is inconsistent. In some instances, these concepts are listed clearly, while in others,
they are presented continuously in a single paragraph. The prompt could benefit from a more consistent
structure for how additional content should be organized.</p>
        <p>llama-3.3-70b-versatile: The content is generally clear and easy to follow, though there are some
issues with tables extending beyond the page, similar to the issues found in the Gemini case. The level
of completeness is comparable to the C2C case for the same reasons mentioned in the previous section.
The quality of examples is similar to that of Gemini: some tables are well-structured and present clear
values, while others overflow the page. One example is supposed to demonstrate how to calculate
opportunity costs using the Kano model (as indicated by its title), but it only provides the final values,
without showing the calculations involved. Additional content is provided, but it appears somewhat
brief and/or poorly formatted, and could benefit from further refinement.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Initial Summary</title>
        <p>The Gemini model shows the most consistent performance across different dimensions
and pipelines. The approach used by the C2C/C2CO pipelines seems to be the right direction, as it allows
more control over the structure and length of the generated content. Based on our evaluation provided in Table
3, the two analyzed LLMs show major differences in terms of the evaluation dimensions example quality
and additional content.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Future Work</title>
      <p>The current version of this application provides a solid foundation with consistent results, but there is
considerable room for improvement. The next logical step is to evolve the content generation process
into a more structured, configurable approach that allows for step-by-step refinement.</p>
      <p>By adding more customizable options, we can address a broader range of users. For instance, default
settings would meet the needs of those seeking quick, approximate results, while users who require
greater control can directly adjust specific parameters. Introducing more interactivity between the user
and the model would provide even finer control over the generated content.</p>
      <p>One approach could involve having the model analyze the input data without producing results
immediately. Instead, it could prompt the user for additional options or guidance before proceeding, in
contrast to the current method where the user interacts only once, at the data input stage.</p>
      <p>Another crucial feature currently missing is quality assurance. At present, results are generated by
the model and returned directly to the user without any validation. To address this, we will introduce a
quality-check mechanism either at the end of the process, between various steps, or ideally, both. These
quality checks will provide valuable feedback to users, allowing them to review suggested improvements
or request more detailed revisions.</p>
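      <p>As a minimal illustration of what such a quality check could look like, the following sketch verifies two simple structural properties of generated LaTeX output. This is an illustrative starting point only, not the validation mechanism planned for the application.</p>
      <preformat>
// One possible automatic quality check, as a minimal sketch: verify that the
// generated LaTeX has balanced curly braces and contains a \chapter command.
function passesBasicLatexCheck(latex: string): boolean {
  let depth = 0;
  for (const ch of latex) {
    if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth < 0) return false; // closing brace without a matching opening brace
    }
  }
  return depth === 0 && latex.includes("\\chapter");
}
      </preformat>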
      <p>An important open issue in this context is to include mechanisms that provide information
about the information sources used by the LLM, primarily for the purpose of preventing copyright
violations and similar issues. This aspect has not been taken into account in the current version of
the book generator, which is also the reason why the system is currently not applied in productive use.</p>
      <p>Finally, for future versions, we also plan to include the possibility of defining seed knowledge, i.e.,
already proposing a basic structure of the book which is then enriched by the LLM.</p>
      <p>By incorporating these changes, we hope to not only enhance the content generation process but
also to foster greater user involvement and satisfaction.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>The presented application demonstrates the potential of LLMs in supporting the automated generation
of books, where different LLMs and prompt configurations produce varying outcomes. This sets the
stage for further development to explore the extent to which these results can be refined. The current
approach leaves room for enhancement; in particular, as new features are added, opportunities arise to
improve both functionality and quality.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors used ChatGPT-4o for grammar checking, spellchecking, and improving the formulation of
the text. All AI-generated suggestions were carefully reviewed and edited by the authors, who take full
responsibility for the content of this publication.</p>
      <p>[4] A. Popescu, S. Polat-Erdeniz, A. Felfernig, M. Uta, M. Atas, V.-M. Le, K. Pilsl, M. Enzelsberger, T. N. T.
Tran, An overview of machine learning techniques in constraint solving, J Intell Inf Syst 58 (2022)
91–118. doi:10.1007/s10844-021-00666-5.</p>
      <p>[5] S. Lubos, A. Felfernig, D. Garber, V.-M. Le, M. Henrich, R. Willfort, J. Fuchs, Towards Group
Decision Support with LLM-based Meeting Analysis, in: 33rd ACM Conference on User Modeling,
Adaptation and Personalization, UMAP Adjunct '25, ACM, New York, NY, USA, 2025, pp. 331–335.
doi:10.1145/3708319.3733646.</p>
      <p>[6] J. Yang, H. Jin, R. Tang, X. Han, Q. Feng, H. Jiang, S. Zhong, B. Yin, X. Hu, Harnessing the Power of
LLMs in Practice: A Survey on ChatGPT and Beyond, ACM Trans. Knowl. Discov. Data 18 (2024).
doi:10.1145/3649506.</p>
      <p>[7] S. Lubos, A. Felfernig, L. Hotz, T. N. T. Tran, S. P. Erdeniz, V. Le, D. Garber, M. E. Mansi, Responsible
Configuration Using LLM-based Sustainability-Aware Explanations, in: É. Vareilles, C. Grosso,
J. M. Horcas, A. Felfernig (Eds.), 26th International Workshop on Configuration (ConfWS 2024),
CEUR-WS.org, 2024, pp. 68–73.</p>
      <p>[8] L. Hotz, C. Bähnisch, S. Lubos, A. Felfernig, A. Haag, J. Twiefel, Exploiting Large Language Models
for the Automated Generation of Constraint Satisfaction Problems, in: ConfWS'24, volume 3812,
CEUR, 2024, pp. 91–100.</p>
      <p>[9] P. Kogler, W. Chen, A. Falkner, A. Haselboeck, S. Wallner, Configuration Copilot: Towards
Integrating Large Language Models and Constraints, in: ConfWS'24, volume 3812, CEUR, 2024, pp.
101–110.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Sabin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weigel</surname>
          </string-name>
          ,
          <article-title>Product configuration frameworks-a survey</article-title>
          ,
          <source>IEEE Intelligent Systems</source>
          <volume>13</volume>
          (
          <year>1998</year>
          )
          <fpage>42</fpage>
          -
          <lpage>49</lpage>
          . doi:10.1109/5254.708432.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hotz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bagley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tiihonen</surname>
          </string-name>
          ,
          <source>Knowledge-based Configuration: From Research to Business Cases</source>
          , 1 ed., Morgan Kaufmann Publishers Inc., San Francisco, CA, USA,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Falkner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Benavides</surname>
          </string-name>
          ,
          <source>Feature Models - AI Driven Design, Analysis, and Applications</source>
          , Springer, Cham,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>