<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Early Variant Approach: The Extract, Transform and Execute (ETE) Agent For Research Software Reuse</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Carlos Utrilla Guerrero</string-name>
          <email>carlos.utrilla.guerrero@alumnos.upm.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Research Software, Reuse, Intelligent Assistant, Natural Language Processing, Artificial Intelligence</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Delft University of Technology</institution>
          ,
          <addr-line>Delft</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidad Politecnica de Madrid</institution>
          ,
          <addr-line>Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Automating the interpretation and execution of Research Software (RS) installation procedures is key to minimizing researcher workload while maximizing reusability. This paper presents ETE, a (work-in-progress) Large Language Model (LLM)-based agent designed to automatically support RS reuse activities from documentation. This early variant agent integrates multiple specialized tools and basic reasoning strategies, such as task decomposition and tool selection, to perform distinct tasks, ranging from extracting install-related instructions from README files to transforming them into a format that can be executed by machines. This work-in-progress introduces the conceptual approach, problem formulation, architectural design, and a minimal Python prototype exploring the potential suitability of using LLM-powered agents to facilitate RS reuse. While formal evaluation remains future work, this conceptual approach lays the foundation for a family of Artificial Intelligence (AI) agents that aim to bridge the gap between human-generated documentation and automated reuse, a significant step toward accelerating scientific discovery.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        It is widely recognized that documentation accompanying Research Software (RS) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], such as source
code comments and README files, contains valuable human-generated information. This
information, which explains either how RS operates or how to install it, can be exploited to build research services and
infrastructures with the goal of facilitating software reusability, accelerating researcher productivity,
and reducing associated costs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        README files commonly detail step-by-step install instructions as part of well-established methods
such as Package Manager, Container, Source Code and/or Setup scripts that aim to facilitate the
installation and execution of RS with minimal friction [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. However, encapsulating RS in portable
environments (Docker containers or Python packages) might impede the understandability, reuse and
interpretation of the individual components contained therein, and the instructions themselves are often outdated or incomplete,
requiring researchers to visually inspect instructions and follow human-generated procedures [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
These unstructured narratives present a major obstacle [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for interpreting and executing instructions
by both humans and machines, ultimately limiting the automation of reuse from documentation across
RS [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>CEUR Workshop Proceedings (ISSN 1613-0073)</p>
      <p>To address these challenges, we present a first variant of a Large Language Model (LLM)-based agent
designed to automatically assist researchers in reuse activities, namely installing an RS from its
documentation (e.g., a GitHub repository URL) and executing it in a virtual environment. We
tackle this challenge through an approach comprising the following tasks: (1)
extracting install-related instructions from README files and (2) transforming them into a format
that can be (3) executed by machines at minimum cost. In this paper, we introduce the ETE agent,
an early, minimal implementation of this conceptual framework, focused specifically on
installation-relevant information. For demonstration purposes, we illustrate the framework’s workflow
as well as learning strategies using real-world GitHub-hosted repositories. Our exploration with the ETE
agent suggests that novel, efficient learning strategies, powered by tool usage
and reasoning capabilities, are a serious possibility for reuse tasks. We further show how the ETE agent
handles each stage iteratively: some stages involve structured data transformation via tool calling, while
others decompose tasks into subtasks to break complexity down into primitive actions for planning
purposes. We also identify several challenges specific to planning tasks, such as gathering task-relevant
information from README and alternative files. While the effectiveness and suitability of our approach
still require rigorous empirical evaluation, it reveals key challenges and new opportunities in applying
LLMs to RS reuse problems.</p>
      <p>The remainder of this paper is organized as follows: Section 2 introduces prior research literature,
providing a foundational context for our work. Section 3.1 describes the practical problem we are
interested in addressing and a potential solution based on the state of the art. In Section 3.2 we present our
workflow for the ETE agent, describing its design details and all the stages involved, including extraction,
transformation and execution. Section 3.3 focuses on the technical implementation of our workflow.
Section 4 illustrates the practical example carried out by the agent in the demo, and Section 5 presents its
limitations and our future lines of work. Finally, Section 6 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Existing related work can be grouped under two major topics: RS reuse and LLMs as Scientific Agents:
• RS Reuse: the benefits of reusing software (e.g., reducing duplication of effort) have been broadly
acknowledged since the early days of the software engineering field [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In spite of its promise, RS reuse
has not become standard practice in RS development yet [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In light of this persistent challenge,
the RS community has renewed its interest in understanding the barriers to effective RS reuse and
developing strategies to overcome them. Among others, research initiatives such as CodeMeta,
SoFAIR, and EVERSE have emerged recently to enhance research software reuse by standardizing
metadata, automating lifecycle management, and promoting software quality.
• LLMs as Scientific Agents: there is a widespread desire in the scientific community to address
the reusability challenge with a fully automatic approach. Recent work has begun to explore the
power of LLM-based intelligent assistants (IAs) in software engineering tasks [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and scientific discovery [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Notable
among these are repository-level tasks [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] that aim to exploit the vast amount of code
openly available in repositories. Several research projects explore automated solutions to generate
unit tests [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], bug [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and issue [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] reports, along with summarization techniques for README files [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ],
as well as to check for conflicts and library vulnerabilities [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. However, a key research question
remains insufficiently understood: how suitable are LLM-powered agents for assisting RS reuse
from documentation?
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Automating RS Reuse via Documentation</title>
      <p>This section presents our conceptual approach for the automatic reuse of RS based on available
documentation. We begin by defining the specific problem our work aims to address, along with a potential
solution (Section 3.1). Next, we provide an overview of the proposed conceptual approach and its core
components (Section 3.2), followed by the implementation details of the framework
(Section 3.3). The overall process is depicted in Figure 1.</p>
      <sec id="sec-3-1">
        <title>3.1. Problem Statement</title>
        <p>We define the problem of automatic reuse of Research Software (RS) as follows: given access
to openly available RS documentation (e.g., a URL to a public Git repository), what actions should
a system, such as an artificial agent, perform to automatically convert human-generated
installation instructions into actionable, machine-executable commands and execute them within a virtual
environment?</p>
        <p>
          Unlike prior work that addresses isolated challenges or narrow tasks, our objective is to explore
how to develop a unified solution covering the full range of problems summarized in Table 1. A
robust solution to enable RS reuse at scale would need to proceed as follows. To overcome the lack of
standardization in RS documentation, including inconsistent installation instructions (P1 and P2),
an agent would first need to extract all software metadata and alternative installation methods described
in the README files (and other files), and then transform each method’s sequence of steps into a structured
JSON format. To tackle P3 (e.g., automation in managing environments and configurations) and P4
(e.g., automated solutions for execution of RS), the community uses continuous integration and deployment
solutions; however, these rely on permissions and might not be fully automated. Therefore, an
agent would need to provide a detailed thinking process (e.g., adopting the approach introduced in our
previous work [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]: the PlanStep, in which the agent first breaks down a complex install method found
in a README file, such as "from source", into several subtasks, and then plans each subtask in a fixed,
sequential order) before generating the two targeted outputs: (i) one to create an isolated environment (e.g., via
Docker or virtual environments) and (ii) another to configure and install the RS within that environment.
        </p>
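        <p>The PlanStep idea of decomposing an install method into subtasks and planning them in a fixed, sequential order can be illustrated with a minimal sketch. It is not the authors' implementation: the function name plan_steps and the example input are hypothetical, and the step fields (method, order, instruction, commands) are borrowed from the example response shown in Figure 3.</p>
        <preformat>
```python
from collections import defaultdict


def plan_steps(instructions):
    """Group install steps by method and order each group sequentially.

    `instructions` is a list of dicts with the keys "method", "order",
    "instruction" and "commands" (field names taken from Figure 3).
    Returns {method: [step, ...]} with every list sorted by "order".
    """
    grouped = defaultdict(list)
    for step in instructions:
        grouped[step["method"]].append(step)
    # A fixed, sequential order per method: sort on the "order" field.
    return {
        method: sorted(steps, key=lambda s: s["order"])
        for method, steps in grouped.items()
    }


# Hypothetical input: two "from source" subtasks listed out of order.
example = [
    {"method": "Source Code", "order": 2,
     "instruction": "Install the package", "commands": ["pip install ."]},
    {"method": "Source Code", "order": 1,
     "instruction": "Clone the repository",
     "commands": ["git clone REPO_URL"]},
]
plan = plan_steps(example)
for step in plan["Source Code"]:
    print(step["order"], step["instruction"])
```
</preformat>
        <p>Sorting on an explicit order field keeps the plan deterministic, which matters when later commands depend on earlier ones.</p>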
        <p>To our knowledge, this is the first integrated Extraction–Transformation–Execution (ETE) conceptual
framework for automated RS reuse using LLM-based agents.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Proposed ETE Agent: An Overview</title>
        <p>In this section, we introduce the ETE agent, a high-level, LLM-powered agent designed to automate
the reuse of Research Software (RS) from documentation, specifically the installation and execution
instructions from open-source RS repositories. We briefly describe our proposed approach, covering
all the stages involved in our workflow. We also describe the agent’s potential reasoning capabilities
and how ETE interacts autonomously with a suite of discrete, function-based tools<sup>1</sup> to accomplish
each phase of the workflow. These stages are briefly summarized in the following subsections and are
depicted in Figure 1.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Stage 1: Extract</title>
          <p>Given an RS path (the URL of a Git repository), the ETE agent first retrieves all install-specific
information from the README file, gathering the fields that cover the statements describing install
methods, procedural steps, instructions, and other key characteristics required for configuring, setting up
and executing an RS. Additionally, the agent clones the repository and extracts other project-specific
metadata relevant to our context (e.g., Name, Programming Language, Executable and Usage examples).
With the environment set up, the ETE agent sends a query to the LLM, which generates a JSON output
(see Figure 3) following the CodeMeta schema as shown in Figure 2. The LLM-generated
output is then validated against a predefined JSON schema. We chose widely used
standards such as CodeMeta properties to represent the install-specific categories from README files,
and the Pydantic library, a widely used Python library for data validation, to ensure
strict adherence to the expected format.</p>
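          <p>The validation step described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the prototype's code: standard-library dataclasses stand in for Pydantic so that the example has no third-party dependency, the function name parse_llm_output is hypothetical, and the field names are taken from Figures 2 and 3.</p>
          <preformat>
```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class InstallStep:
    method: str
    order: int
    instruction: str
    commands: List[str] = field(default_factory=list)


@dataclass
class ReadmeAnalysis:
    methods: List[str] = field(default_factory=list)
    installation_instructions_per_method: List[InstallStep] = field(
        default_factory=list)
    important_links: List[str] = field(default_factory=list)


def parse_llm_output(raw):
    """Parse the raw LLM reply and reject it if it does not fit the schema."""
    # json.loads raises json.JSONDecodeError (a ValueError) on non-JSON text.
    data = json.loads(raw)
    steps = []
    for item in data.get("installation_instructions_per_method", []):
        if "method" not in item or not isinstance(item.get("order"), int):
            raise ValueError("each step needs a 'method' and an integer 'order'")
        steps.append(InstallStep(
            method=str(item["method"]),
            order=item["order"],
            instruction=str(item.get("instruction", "")),
            commands=[str(c) for c in item.get("commands", [])],
        ))
    return ReadmeAnalysis(
        methods=[str(m) for m in data.get("methods", [])],
        installation_instructions_per_method=steps,
        important_links=[str(u) for u in data.get("important_links", [])],
    )


# Hypothetical LLM reply, shaped like the response in Figure 3.
raw_reply = json.dumps({
    "methods": ["Docker"],
    "installation_instructions_per_method": [
        {"method": "Docker", "order": 1,
         "instruction": "Verify Docker configuration",
         "commands": ["docker run hello-world"]},
    ],
})
analysis = parse_llm_output(raw_reply)
print(analysis.installation_instructions_per_method[0].commands)
```
</preformat>
          <p>Rejecting malformed replies at this point means the later stages only ever receive data in the expected shape.</p>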
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Stage 2: Transform</title>
          <p>Once the Extract stage (Stage 1) is completed, its output (e.g., the Installation Instructions
JSON file) is fed into Stage 2, with the goal of transforming it into machine-executable files.
Specifically, the agent must:
1. Compare and comprehensively analyse the installation-relevant information from the Installation
Instructions JSON file and any supplementary files with information about dependencies and
environmental requirements, among others: .yaml, .config and pyproject.toml.
2. Generate an actionable, step-by-step installation plan that maps each step to its prerequisites,
ordered installation steps with instructions, dependency information and compatible operating
system (see Figure 5).</p>
          <p><sup>1</sup> Mimicking the steps that a human tasked with the same activity would need to perform, such as exploring documentation,
setting up the environment, following instructions based on operating systems, and executing commands in a terminal.</p>
          <p>JSON Schema for Installation Instructions (Summarization Data validation)</p>
          <preformat>
from typing import List

from pydantic import BaseModel, Field


class ReadmeAnalysisContent(BaseModel):
    """Model for parsed README."""

    # InstallStep is a separate model that is not shown in this figure.
    methods: List[str] = Field(
        default_factory=list,
        description="List of installation methods")
    installation_instructions_per_method: List[InstallStep] = Field(
        default_factory=list,
        description="List of installation and usage commands with metadata")
    important_links: List[str] = Field(
        default_factory=list,
        description="List of relevant documentation links")
</preformat>
          <p>3. Transform the plan into executable commands and output them as a Dockerfile and shell script
(install.sh) suitable for automated execution (depicted in Figure 6 and Figure 7, respectively).</p>
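          <p>The final transformation step, turning an ordered plan into an install.sh script and a Dockerfile, can be sketched as below. This is only an illustration: the function names, the example steps and the base image python:3.11-slim are assumptions of this sketch and are not taken from the prototype.</p>
          <preformat>
```python
def render_install_script(steps):
    """Turn ordered steps into the text of a shell script (install.sh)."""
    lines = ["#!/bin/sh", "set -e  # stop at the first failing command"]
    for step in sorted(steps, key=lambda s: s["order"]):
        lines.append("# Step {}: {}".format(step["order"], step["instruction"]))
        lines.extend(step["commands"])
    return "\n".join(lines) + "\n"


def render_dockerfile(base_image="python:3.11-slim"):
    """Return a Dockerfile that copies the repository and runs install.sh.

    The base image is an assumption of this sketch.
    """
    return "\n".join([
        "FROM " + base_image,
        "WORKDIR /app",
        "COPY . /app",
        "RUN sh install.sh",
    ]) + "\n"


# Hypothetical plan for a "Package Manager" method, listed out of order.
steps = [
    {"order": 2, "instruction": "Install dependencies",
     "commands": [".venv/bin/pip install -r requirements.txt"]},
    {"order": 1, "instruction": "Create a virtual environment",
     "commands": ["python3 -m venv .venv"]},
]
script = render_install_script(steps)
dockerfile = render_dockerfile()
print(script)
print(dockerfile)
```
</preformat>
          <p>Keeping the instruction text as a comment above each command preserves the link between the generated script and the README step it came from.</p>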
          <p>Installation Instructions (Summarization response in JSON)</p>
          <preformat>
{"repo": "Darwin Godel Machine (DFM)",
 "methods": ["Docker", "Package Manager"],
 "prerequisites": ["Docker", "Python", "pip"],
 "operating_system": "Linux",
 "installation_instructions_per_method": [
   {"method": "Docker", "order": 1,
    "instruction": "Verify Docker configuration",
    "commands": ["docker run hello-world"]
   }, (...)
   {"method": "Package Manager", "order": 1,
    (...) },
 "environment": ["OPENAI_API_KEY",
                 "ANTHROPIC_API_KEY"],
 "important_links": [
   "https://github.com/jennyzzt/dgm/blob/main/LICENSE",
   "https://sakana.ai/dgm/"]}
</preformat>
          <p>Figure 3: Example of an ETE-generated install-related extraction response from the DFM repository in Stage 1 (Extract).
This JSON file contains README install-specific information organised into: two distinct plans, a precise sequence of install
steps with commands, a list of dependencies and prerequisites (e.g., setting up API keys), as well as important links.</p>
          <p>Installation Plan (Summarization reasoning response in JSON)
install approaches were chosen over alternatives (see Figure 4).</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Stage 3: Execute</title>
          <p>Building on the outputs generated in Stage 2, the third stage of the ETE agent is responsible for
executing the resulting files. Thus, this stage focuses on two primary objectives:
• Locating the files: Verifying the presence of all required outputs, including the Dockerfile, install.sh
script, and associated JSON files.
• Executing the files: Running the commands specified in the executable artifacts to automate the
environment setup and RS installation process.</p>
          <p>During this stage, the agent interacts with an LLM by sending structured prompts, processing
responses, and executing tool calls dynamically. If all objectives are successfully met, the ETE agent
returns a completion report indicating a successful installation. Otherwise, the agent invokes available
tools (e.g., writer tool, web searcher or terminal tool), formulates a hypothesis to guide error resolution,
generates a revised prompt for repairing the issue, and writes and re-executes a new installation script
(see output generated by ETE agent in Figure 8).</p>
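          <p>The two objectives of this stage, together with the repair loop, can be sketched as follows. This is a simplified illustration, not the agent's code: the repair callback is a hypothetical stand-in for the LLM-driven diagnosis and rewriting step, the limit of three attempts is an assumption, and the demonstration uses a fake runner so that no real script is executed.</p>
          <preformat>
```python
import subprocess
from pathlib import Path

REQUIRED_FILES = ("Dockerfile", "install.sh")


def missing_artifacts(workspace, required=REQUIRED_FILES):
    """Return the names of required files that are absent from workspace."""
    root = Path(workspace)
    return [name for name in required if not (root / name).is_file()]


def run_script(script_path):
    """Run a shell script and return (succeeded, captured stderr)."""
    result = subprocess.run(
        ["sh", str(script_path)], capture_output=True, text=True
    )
    return result.returncode == 0, result.stderr


def execute_with_repair(script_path, repair, run=run_script, max_attempts=3):
    """Run the script; on failure let `repair` rewrite it, then retry.

    `repair(script_path, error_text)` is a hypothetical callback standing in
    for the step that diagnoses the error and writes a new script.
    Returns a small completion report.
    """
    for attempt in range(1, max_attempts + 1):
        ok, error_text = run(script_path)
        if ok:
            return {"success": True, "attempts": attempt}
        if attempt != max_attempts:
            repair(script_path, error_text)
    return {"success": False, "attempts": max_attempts}


# Demonstration with a fake runner that fails once and then succeeds.
calls = []
repairs = []


def fake_run(path):
    calls.append(path)
    return len(calls) == 2, "missing requirements.txt"


report = execute_with_repair(
    "install.sh", lambda path, err: repairs.append(err), run=fake_run
)
print(report, repairs)
```
</preformat>
          <p>Passing the runner as a parameter keeps the retry logic testable without touching the host system.</p>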
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Implementation Details</title>
        <p>This section describes the formulation of prompting approaches as well as the models and tool usage
taken into account for our ETE agent exploration.</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Prompting Approaches</title>
          <p>Our ETE agent follows widely used guidelines and standards to comply with the protocols of the
GROQ reasoning services. Each prompt comprises: agent role (defines the primary task of the agent),
tasks (outlines specific objectives), examples (if applicable, simple examples in the prompt to illustrate
the expected format) and available tools (if applicable, the functions that are to be used by the agent).
Additionally, we explore different settings, depending on the stage (see Figure 9):
1. Extraction: we implement few-shot prompting with concrete examples.
2. Transform: we employ zero-shot instructions with constraints for the reasoning approach.
3. Execute: each step in Stage 3 follows a ReAct-oriented approach as proposed by Yao et al., 2022,
where LLMs generate both reasoning traces and actions.</p>
          <p>Step 3 – Installation Error Diagnosed</p>
          <preformat>
&lt;THINK&gt;
The installation process failed due to a missing requirements.txt. According to the protocol, when an installation error occurs,
the next action is to generate a search prompt based on the error details (STEP 4). If Error Occurs, the next logical step is to invoke
the tool formulate_error_search_prompt. Tool Call:
{"name": "execute_tool_in_terminal",
 "arguments": {
   "name": "formulate_error_search_prompt",
   "arguments": {
     "error_file": "execution_agent_workspace/outputs/installation_errors.log"}}}
&lt;ACTION&gt;
</preformat>
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Models</title>
          <p>
            We chose the DeepSeek-R1-Distill-Llama-70B model<sup>2</sup> via the GROQ API endpoints with a
temperature of 0.3 to balance creativity and accuracy. DeepSeek-R1 family models tend to work well when
reasoning capabilities and agent tool approaches are required [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ].
          </p>
        </sec>
        <sec id="sec-3-3-3">
          <title>3.3.3. Tool Usage</title>
          <p>To support the ETE agent in undertaking its tasks, we equip it with a suite of tools implemented as executable
Python functions. These tools aim to enhance the ETE agent’s capabilities across the entire ETE
workflow. At the time of writing, however, the following set of tools is only supported in Stages
2 and 3:
• Terminal: executes shell commands within a Linux environment.
• Reasoner: reads initial installation plans, selects an optimal path with ordered steps, and outputs
reasoning traces in JSON format.
• Reader and Writer: handle file operations in bash. Given a file path and content, the tools either
read from or write to the target file.
• Web Searcher: issues web queries to collect best practices related to container configuration and
integrates relevant results into installation recommendations during validation.</p>
          <p><sup>2</sup> https://console.groq.com/docs/model/deepseek-r1-distill-llama-70b</p>
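          <p>A tool suite of this kind is often exposed to the model through a name-to-function registry. The sketch below illustrates that pattern for the Terminal, Reader and Writer tools; it is an assumption-laden illustration rather than the prototype's code. The function names, the registry and the dispatch helper are hypothetical, and the tool-call layout (a name plus an arguments object) follows the tool call shown in the agent output under Prompting Approaches.</p>
          <preformat>
```python
import json
import os
import subprocess
import tempfile
from pathlib import Path


def terminal(command):
    """Terminal tool: run a shell command and return its combined output."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr


def reader(path):
    """Reader tool: return the text content of a file."""
    return Path(path).read_text()


def writer(path, content):
    """Writer tool: write content to a file and report what was written."""
    Path(path).write_text(content)
    return "wrote {} characters to {}".format(len(content), path)


# Registry mapping the tool name used in a tool call to its Python function.
TOOLS = {"terminal": terminal, "reader": reader, "writer": writer}


def dispatch(tool_call):
    """Execute a tool call given as a JSON string with "name" and "arguments"."""
    call = json.loads(tool_call)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError("unknown tool: " + name)
    return TOOLS[name](**call["arguments"])


# Demonstration: write a file through the registry, then read it back.
target = os.path.join(tempfile.mkdtemp(), "install.sh")
dispatch(json.dumps(
    {"name": "writer", "arguments": {"path": target, "content": "echo ok\n"}}
))
content = dispatch(json.dumps({"name": "reader", "arguments": {"path": target}}))
print(content)
```
</preformat>
          <p>Routing every call through one dispatch function gives a single place to log, restrict or sandbox tool use.</p>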
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. ETE Agent Demonstration</title>
      <p>For our demonstration, we will present ETE agent from a conceptual perspective and demonstrate its
potential suitability for assisting in reuse-oriented tasks, as previously discussed. The proposed ETE
agent is publicly available at https://github.com/carlosug/agent.rse.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Limitations and Future Work</title>
      <p>In this paper we introduce the first variant of the ETE agent as a conceptual framework, still in its early
stage of exploration and development. We acknowledge that our work has limitations summarised as
follows:
• RS Documentation Complexity: Our limited solution performs well on well-documented, widely
used research software (RS), particularly when relying exclusively on README files. The suitability
of our approach to non-standardised RS documentation beyond README files remains insufficiently
studied. To address this, we plan to mine RS repositories to analyze the diversity of installation
methods. This will characterise how RS complexity affects agent performance at scale.
• Need for Systematic Evaluation: The ETE agent leverages emerging LLM capabilities such as tool
calling, though its robustness in long-term planning is still unproven. A key limitation is the lack of
benchmarks for RS reuse tasks. Future work will focus on building a more sophisticated prototype
and developing methodologies for accurately benchmarking RS reuse in realistic scenarios.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this work, we presented the ETE agent, our first variant agent designed to automate the reuse of
research software (RS) by interpreting installation and execution instructions from open-source RS
repositories. The agent leverages LLMs, alongside specialized tools and reasoning capabilities, to extract
install-related instructions from README files, and convert them into machine-executable formats
with minimal manual intervention. This work marks a first step toward a family of ETE-derived agents
capable of autonomously supporting RS reuse—a key challenge in accelerating scientific discovery.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work is supported by the Ontology Engineering Group (OEG) under the PhD in Artificial
Intelligence Program with Universidad Politécnica de Madrid, and through the support of the research team
supervisor Dr. Daniel Garijo. The authors would also like to acknowledge TU Delft University.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT in order to: Grammar and spelling
check.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Barker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. P.</given-names>
            <surname>Chue Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Katz</surname>
          </string-name>
          , et al.,
          <article-title>Introducing the FAIR Principles for research software</article-title>
          ,
          <source>Scientific Data</source>
          <volume>9</volume>
          (
          <year>2022</year>
          ). doi:10.1038/s41597-022-01710-x.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Abate</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Di Cosmo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gesbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Le Fessant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Treinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zacchiroli</surname>
          </string-name>
          ,
          <article-title>Mining component repositories for installability issues</article-title>
          ,
          <source>in: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, IEEE</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Utrilla Guerrero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garijo</surname>
          </string-name>
          ,
          <source>Automated Extraction of Research Software Installation Instructions from README Files: An Initial Analysis, Lecture Notes in Computer Science 14770 LNAI</source>
          (
          <year>2024</year>
          )
          <fpage>114</fpage>
          -
          <lpage>133</lpage>
          . doi:10.1007/978-3-031-65794-8_8.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Treude</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zahedi</surname>
          </string-name>
          ,
          <article-title>Adapting installation instructions in rapidly evolving software ecosystems</article-title>
          ,
          <source>IEEE Transactions on Software Engineering</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fehr</surname>
          </string-name>
          ,
          <article-title>Documenting research software in engineering science</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>12</volume>
          (
          <year>2022</year>
          ).
          doi:10.1038/s41598-022-10376-9.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Salerno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Treude</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Thongtatunam</surname>
          </string-name>
          ,
          <article-title>Open source software development tool installation: Challenges and strategies for novice developers</article-title>
          ,
          <source>arXiv preprint arXiv:2404.14637</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Easytool: Enhancing llm-based agents with concise tool instruction</article-title>
          ,
          <source>arXiv preprint arXiv:2401.06201</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Naur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Randell</surname>
          </string-name>
          ,
          <article-title>Software engineering: Report on a conference by the NATO Science Committee</article-title>
          ,
          <source>NATO Scientific Affairs Division</source>
          , Brussels, (
          <year>1968</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Woolley</surname>
          </string-name>
          ,
          <article-title>Barriers to device longevity and reuse: A vintage device empirical study</article-title>
          ,
          <source>Journal of Systems and Software</source>
          <volume>211</volume>
          (
          <year>2024</year>
          )
          <fpage>111991</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pezzè</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abrahão</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Penzenstadler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Poshyvanyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roychoudhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <article-title>A 2030 roadmap for software engineering</article-title>
          ,
          <source>ACM Transactions on Software Engineering and Methodology</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghafarollahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Buehler</surname>
          </string-name>
          ,
          <article-title>SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning</article-title>
          ,
          <source>arXiv preprint arXiv:2409.05556</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bairi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sonwane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kanade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>D. C</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Iyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Parthasarathy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rajamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ashok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shet</surname>
          </string-name>
          ,
          <article-title>CodePlan: Repository-level Coding using LLMs and Planning</article-title>
          ,
          <source>arXiv preprint arXiv:2309.12499</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vendome</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Linares-Vasquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Poshyvanyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Kraft</surname>
          </string-name>
          ,
          <article-title>Automatically Documenting Unit Test Cases</article-title>
          , in:
          <source>Proceedings of IEEE International Conference on Software Testing, Verification and Validation, ICST 2016</source>
          (
          <year>2016</year>
          )
          <fpage>341</fpage>
          -
          <lpage>352</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/ICST.2016.30</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rastkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Murphy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Murray</surname>
          </string-name>
          ,
          <article-title>Automatic summarization of bug reports</article-title>
          ,
          <source>IEEE Transactions on Software Engineering</source>
          <volume>40</volume>
          (
          <year>2014</year>
          )
          <fpage>366</fpage>
          -
          <lpage>380</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/TSE.2013.2297712</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sridhara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Muppaneni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pollock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vijay-Shanker</surname>
          </string-name>
          ,
          <article-title>Towards automatically generating summary comments for Java methods</article-title>
          ,
          in:
          <source>Proceedings of the IEEE/ACM International Conference on Automated Software Engineering</source>
          (
          <year>2010</year>
          )
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/1858996.1859006</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Treude</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zahedi</surname>
          </string-name>
          ,
          <article-title>Evaluating Transfer Learning for Simplifying GitHub READMEs</article-title>
          , in:
          <source>Proceedings of the 31st ACM Joint European Software Engineering Conference</source>
          ,
          <year>2023</year>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3611643.3616291</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>H. O.</given-names>
            <surname>Delicheh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Decan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mens</surname>
          </string-name>
          ,
          <article-title>Quantifying Security Issues in Reusable JavaScript Actions in GitHub Workflows</article-title>
          ,
          in:
          <source>Proceedings of IEEE/ACM 21st International Conference on Mining Software Repositories, MSR 2024</source>
          (
          <year>2024</year>
          )
          <fpage>692</fpage>
          -
          <lpage>703</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3643991.3644899</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garijo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fakhraei</surname>
          </string-name>
          ,
          <article-title>Somef: A framework for capturing scientific software metadata from its documentation</article-title>
          ,
          in:
          <source>2019 IEEE International Conference on Big Data (Big Data)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>3032</fpage>
          -
          <lpage>3037</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/BigData47090.2019.9006447</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I.</given-names>
            <surname>Bouzenia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Devanbu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pradel</surname>
          </string-name>
          ,
          <article-title>RepairAgent: An autonomous, LLM-based agent for program repair</article-title>
          ,
          <source>arXiv preprint arXiv:2403.17134</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Bi</surname>
          </string-name>
          , et al.,
          <article-title>DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning</article-title>
          ,
          <source>arXiv preprint arXiv:2501.12948</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>