<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>First International Workshop on Human-AI Collaborative Systems, editors Michele Braccini, Allegra De Filippo,
Michela Milano, Alessandro Saffiotti, Mauro Vallati; October</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Impact of LLM-Assisted Coding in Creativity and Robustness of Robot Controllers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paolo Baldini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michele Braccini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Roli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering (DISI), Alma Mater Studiorum - Università di Bologna, Campus of Cesena</institution>
          ,
          <addr-line>Cesena, 47521</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>European Centre for Living Technology</institution>
          ,
          <addr-line>Venice, 30123</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>25</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>The use of Large Language Models (LLMs) in work environments has recently started to gain attention in the research community, with many works reporting increased productivity and others reporting homogenization of the output. Nevertheless, their use as coding assistants in robotics has been mostly overlooked, especially regarding their effects compared to unassisted programming. We claim that peculiar characteristics of robotics programming, such as the robustness of the produced solution, deserve special attention. Here we analyze the effects of using LLMs as coding assistants in robotics. We analyze their effects on the performance, including under a pseudo-reality gap, and on the similarity of the produced controllers. We also briefly discuss the feedback of some participants of the experiment. The results suggest that the code produced with the assistance of LLMs is less robust to unseen conditions, and overall more homogeneous. Additionally, we report a shorter development time when using LLMs, but a poorer coding experience.</p>
      </abstract>
      <kwd-group>
        <kwd>LLM</kwd>
        <kwd>Robotics</kwd>
        <kwd>Development</kwd>
        <kwd>Creativity</kwd>
        <kwd>Robustness</kwd>
        <kwd>Pseudo-Reality Gap</kwd>
        <kwd>High-Level Education</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The release of ChatGPT 3.5 in 2022 quickly showed how human work could be replaced or made more
efficient. Among the many areas affected, software development experienced a major change. The ability
of Large Language Models (LLMs) to predict the code to be written, together with their seemingly
immediate access to a huge amount of information, led to massive speed-ups in code production [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This improvement led to the adoption of LLMs as coding assistants in
many development environments.
      </p>
      <p>
        The sudden interest in this technology also captured the curiosity of researchers, who tried to assess
how their specific area of knowledge could be affected by these tools. This led to a proliferation
of works highlighting the positive effect of LLMs in making the work of professionals more effective.
Nevertheless, some limitations of these systems also started to emerge. Of particular interest to us is
the perceived homogenization of the output produced with the assistance of LLMs [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. Many works
reported that LLMs reduce the creativity of humans, leading to the production of similar outputs. Some
works argue that this can lead to a decrease in the novelty of human creations.
      </p>
      <p>One field in which the impact of LLMs on development has received less attention is robotics. Specifically,
to the best of the authors' knowledge, no work has assessed their effects on the performance of the produced
solution against the reality gap or their effect on the creativity of the produced solution. In this work
we perform a preliminary analysis of the effects of using LLMs as coding assistants while programming
robot controllers. We do so by comparing the results obtained by the students of a university course in
the creation of controllers for a specific task. Specifically, we explore the effects on the creativity
of the produced solutions (i.e., their structural diversity), their performance, and their robustness to a
pseudo-reality gap<sup>1</sup>.</p>
      <p>The article is organized as follows. In Section 2 we discuss works analyzing the use of LLMs for
coding, and specifically their use in robotics. Section 3 presents our experiment and the experimental
settings. Section 4 presents and explains the results obtained. In Section 5 we discuss the outcomes of
the experiment and how and why they should be considered when deciding to use LLMs to
develop robot controllers. Finally, Section 6 summarizes the work done and proposes future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        The use of LLMs in code production has recently started to increase [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. This widespread interest
caught the attention of researchers, who started to question how these tools really affect development.
Some works concentrated solely on the performance of LLMs with respect to humans [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Nevertheless,
here we are interested in their effect as coding assistants. Preliminary experiments considered small
groups in time-limited contexts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Subsequent works reported findings from large businesses and
companies [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. The results suggest that LLMs improve the work efficiency and quality of developers.
However, human supervision remains important for the effective use of these tools.
One limitation of these works is that they mostly focus on the reported perception of the participants,
and not on technical metrics. This could hide important flaws behind the perceived utility. He [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
analyzes the presence of vulnerabilities in produced code, with and without using LLMs. The results
suggest that the code produced with the support of LLMs presents more security issues than that
produced without. This also affects the trust that humans have towards LLMs [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Indeed, most
developers trust LLMs mainly for simple tasks such as test generation, and trust them less for
code development and fixing.
      </p>
      <p>
        Multiple works considered LLMs for assisted development, but few investigated their use in robotics.
We notice that the most common approach to the problem of generating code for robots consists of the
generation of high-level plans (i.e., sequences of action commands) [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref15">12, 13, 14, 15</xref>
        ]. The motivation is
that most works leverage existing (or assume the existence of) sets of basic skills. Therefore, the LLM
just needs to combine them, allowing the automatic generation of control plans. This approach addresses
various limitations of the LLMs themselves. For instance, the performance of LLMs tends to
decrease for long texts due to difficulties in effectively using the whole context [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. When considering
the generation of complex code, this becomes a problem. The generation of high-level plans
mitigates this issue by reducing the length of the output required from the LLM. Nevertheless, this approach
seems to be just a workaround, as complex tasks requiring long high-level plans still show a
decrease in performance [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Another problem that this approach solves is the generation of functioning
code for specific platforms. Indeed, different robots perceive and act through devices that produce, and
are controlled by, specific types of signals. LLMs often do not know how to interpret them, and therefore
cannot produce the required code targeting a specific robot. Finally, combining pre-built blocks is a
common approach in robotics to face the so-called reality gap [
        <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
        ]. Overall, this approach reduces
the need for human developers supervising the code production. This can lead to a speed-up in code
production of up to 90%<sup>2</sup>, with a consequent reduction in costs [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        One of the few works trying to generate robot code at a level lower than a plan is Liang et al.
[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. The authors create a system that converts human goals to an LLM-generated plan. The plan is
then translated to code, iteratively implementing undefined functions. Also in this case the system
assumes the existence of control primitives, but it has more control over the execution flow and the
code organization. The authors report that the system struggles with commands or goals that are longer and
more complex than those given as examples, highlighting an important weakness of the system.
      </p>
      <p>
        Recently, novel approaches have started to emerge as a step beyond classical plan generation for robots.
Antero et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] propose using LLMs not only as code or plan generators, but also as code evaluators.
This methodology sets two different LLMs against each other: one generates Finite State Machine
controllers, and the other evaluates them. The controller is iteratively refined until no error
is detected. This aims at completely removing the burden of code or plan generation from human
developers.
        <fn><label>1</label><p>A pseudo-reality gap simulates the passage from simulation to the real robot, without actually using the latter.</p></fn>
        <fn><label>2</label><p>This assumes that the low-level skills already exist.</p></fn>
      </p>
      <p>
        Differently, Schlesinger et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] propose a system that employs LLMs for the automatic
generation of robot plans and the recovery from errors. The plan produced by the LLM runs until
an error occurs or a human blocks the execution. The LLM then updates the code to
overcome the detected problem. Vemprala et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] also propose the real-time generation of control
plans. However, rather than performing an automatic adaptation of the plan, they convert human
commands to code to be immediately executed by the robot.
      </p>
      <p>Most of the aforementioned works combining LLMs and robotics aim at generating high-level plans.
Additionally, they see LLMs as replacements for human developers rather than as assistants. If
humans remain part of the loop, they are mostly not expected to code, but rather to supervise. Here we
argue that humans are still often at the core of robotic development, and thus we see LLMs as
assistants rather than replacements. Our approach therefore envisions a collaboration between generative
systems and developers through assisted development. In this context, the code is still often produced
from scratch, starting from the implementation of the low-level behaviors (which we argue requires
a great deal of effort) up to the complete control logic. Therefore, differently from other works, we
examine a case of full code development, and not just planning.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>This work aims at assessing the effect of LLMs on the development of robot controllers. As multiple
solutions and approaches exist, we need to collect a set of solutions to compare. Therefore, we ask
students of the University of Bologna to implement a controller for a specific task. No student reported
having working experience in robotics, but all attended the course of Intelligent Robotic Systems, which
gives them a wide perspective of development approaches and strategies in robotics. They are thus aware
of the issues that could arise while programming robots, such as noise and the reality gap. However, the
participants of the experiment are not aware of the analyses we perform on the controllers they create.
The experiment involved thirteen male students, whom we divided into two groups. Seven participants
program with the assistance of LLMs, while the remaining six program without (see Figure 1). All the
students are allowed to access the course material and laboratory code, plus the documentation of
the simulation environment and programming language. Nevertheless, they are not allowed to access
the internet except for the aforementioned resources.</p>
      <p>
        The task considered in this experiment is path-following. A robot deployed randomly in an arena
must search for a black path, and then follow it. The robot must keep moving as fast as possible,
avoiding turning on itself. This is a classical robotic task that does not constrain the use of a specific
control architecture. To drive the development, we present the students with the evaluation function we use
for assessing the performance of the controller in the experimental analysis section (see Equation 1).
This computes the performance at a specific step, and thus needs to be accumulated over the entire
duration of the experiment to obtain an overall evaluation. The function considers the average color
of the ground, i.e., how often the robot remains on the line, and its direction and average speed. All
these metrics should be maximized, as having just one of them at zero zeroes the step
performance. The step evaluation function is the following:
        <disp-formula>
          <label>(1)</label>
          <tex-math>\frac{1}{N} \sum_{i=1}^{N} \bigl(1 - g(i)\bigr) \times \left(1 - \frac{|v_l - v_r|}{2}\right) \times \max\left(0, \frac{v_l + v_r}{2}\right)</tex-math>
        </disp-formula>
where:
        <list list-type="bullet">
          <list-item><p><inline-formula><tex-math>N</tex-math></inline-formula> is the number of ground sensors,</p></list-item>
          <list-item><p><inline-formula><tex-math>g(i)</tex-math></inline-formula> is the perception of the ground color from sensor <inline-formula><tex-math>i</tex-math></inline-formula> in <inline-formula><tex-math>[0, 1]</tex-math></inline-formula>, where 0 indicates black and 1 indicates white,</p></list-item>
          <list-item><p><inline-formula><tex-math>v_l, v_r</tex-math></inline-formula> are respectively the speed of the left and right wheels, transformed in <inline-formula><tex-math>[-1, 1]</tex-math></inline-formula>.</p></list-item>
        </list>
[Figure 2: … the (a) eight, (b) heart, and (c) hexagon paths during development. The (d) train and (e) freehand paths are used in the test arenas.]
      </p>
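      <p>As a rough illustration, the step evaluation function of Equation 1 can be sketched in Python as follows. The names step_performance, run_performance, ground, v_left and v_right are illustrative assumptions and are not taken from the experiment code, which was written in Lua. The sketch assumes that the ground readings are already in [0, 1] and that the wheel speeds are already rescaled to [-1, 1].</p>
      <preformat>
```python
from typing import Iterable, Sequence, Tuple


def step_performance(ground: Sequence[float], v_left: float, v_right: float) -> float:
    """Step score following Equation 1 (names are illustrative assumptions).

    ground: ground sensor readings in [0, 1], where 0 is black and 1 is white.
    v_left, v_right: wheel speeds, assumed to be already rescaled to [-1, 1].
    """
    if not ground:
        raise ValueError("at least one ground sensor reading is required")
    # Average darkness under the robot: 1 when every sensor sees the black path.
    on_line = sum(1.0 - g for g in ground) / len(ground)
    # Penalize differences between the wheel speeds, i.e. turning.
    straightness = 1.0 - abs(v_left - v_right) / 2.0
    # Reward forward motion only; moving backwards scores zero.
    forward_speed = max(0.0, (v_left + v_right) / 2.0)
    return on_line * straightness * forward_speed


def run_performance(steps: Iterable[Tuple[Sequence[float], float, float]]) -> float:
    """Accumulate the step score over all the steps of a run."""
    return sum(step_performance(g, vl, vr) for g, vl, vr in steps)
```
      </preformat>
      <p>Because the three factors are multiplied, a robot spinning in place (opposite wheel speeds) scores zero at that step regardless of the ground color, and so does a robot driving straight over a fully white area.</p>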
      <p>
        The students program a controller for the Foot-Bot robot [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ]. We simulate the robot and its
behavior in the ARGoS3 simulator [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. To simplify and accelerate the development, the controller
is implemented in the Lua language [25], which however cannot be directly ported to the physical
robots. This decision is due to the length of the experiment (i.e., three hours) which requires a fast
development of the solution. Additionally, it should simplify the LLM generation, as Lua is a very
permissive language similar to pseudocode.
      </p>
      <p>The language model we consider in this experiment is Copilot, in its May 2025 version [26].
Specifically, we permit its use through its online interface. This means that it has access only to the
code uploaded by the students. We chose this LLM and this interaction mode for multiple reasons. First,
it is a widespread tool in development. Second, it can be used remotely and without login, avoiding
biases due to previous interactions or the presence of local files. Third, it can provide fast answers (i.e.,
Quick Response) and longer reasoned ones (i.e., Think Deeper), allowing each student to use what they
prefer. Finally, it works even in incognito mode, which we require as an additional measure to avoid
biases in the generated responses. We require the students belonging to the group using LLMs to query
Copilot at least three times during the development.</p>
      <p>Before performing the experiments, we require the students to take a four-minute Divergent
Association Task (DAT) [27]. This aims to measure verbal creativity and the ability to generate
diverse solutions to open-ended problems [28]. We use the test to divide the students into the two groups,
keeping the distribution of the DAT scores as balanced as possible (see Figure 6). The students are
not aware of the test results or of the logic behind the group subdivision.</p>
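      <p>The text does not detail how the DAT scores were balanced between the two groups. One possible way to obtain such a split, shown here only as a hypothetical sketch and not as the procedure used in the study, is to rank the participants by score and assign them in a snake order. The function name split_balanced and the participant identifiers are assumptions made for the example.</p>
      <preformat>
```python
from typing import Dict, List, Tuple


def split_balanced(scores: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split participants into two groups with similar score distributions.

    Hypothetical procedure: rank by score, then assign in a snake order
    (A, B, B, A, A, B, B, A, ...) so that neither group collects all the
    highest or all the lowest scores.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    groups: Tuple[List[str], List[str]] = ([], [])
    for position, participant in enumerate(ranked):
        # Positions 0, 3, 4, 7, 8, ... go to the first group, the rest to the second.
        groups[((position + 1) // 2) % 2].append(participant)
    return groups
```
      </preformat>
      <p>With thirteen participants, this assignment yields one group of seven and one group of six, which matches the group sizes reported above.</p>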
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>The first metric we use to analyze the results is the performance of the produced controllers (see
Figure 3). We consider the performance on the three arenas presented to the students, and on two
additional arenas (see Figure 2). For each controller, we perform 100 runs per arena, with a limit of 5
minutes of simulation. The group coding without the assistance of the LLM obtained better results on
average in all the arenas. However, the group using Copilot attained the highest performance in the
arenas presented to the students (i.e., eight, heart, hexagon). This difference in the highest performance
disappears when considering the two test arenas (i.e., the arenas not presented to the students: train
and freehand). Even more interesting are the results obtained in the pseudo-reality gap, which in this
case consists of experimenting with much higher noise [29, 30, 31]. In this scenario the performance attained
by the group using Copilot drops, while the performance obtained by the other group remains overall
stable. This indicates that the controllers produced with the assistance of an LLM are less robust than
the controllers produced by students alone.</p>
      <p>
        The second metric we use to analyze the results is the code similarity. This considers the similarity
between pairs of programs produced by the students of each group. For the analysis we use a software tool
named Dolos, which aims to identify plagiarism in code [32]. It measures the similarity of programs while
ignoring comments, variable names and text position. Indeed, these factors tend to obscure the
results when using different metrics, such as the Normalized Compression Distance (NCD) (see Figure 7).
The results show that the code produced with the assistance of Copilot is significantly more similar than
that produced by the students alone (see Figure 4). This seems to indicate a decrease in creativity
and an overall homogenization of the results, as reported by previous studies [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
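      <p>For reference, the Normalized Compression Distance mentioned above is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C denotes the compressed length. The following minimal sketch assumes zlib as the compressor; the text does not state which compressor was used in the experiment, and the function names are illustrative.</p>
      <preformat>
```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(a: str, b: str) -> float:
    """Normalized Compression Distance between two source files given as text.

    Values close to 0 suggest very similar texts, values close to 1 suggest
    unrelated ones. zlib is an assumption made for this sketch.
    """
    x, y = a.encode("utf-8"), b.encode("utf-8")
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```
      </preformat>
      <p>Since the metric works on the raw text, renaming variables, adding comments or reordering functions changes the compressed sizes and therefore the distance, which is consistent with the observation that such factors obscure the comparison.</p>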
      <p>Both the performance and the code similarity seem to be affected by the use of an LLM as a coding
assistant. However, it is important to avoid biases due to the creativity and expertise of the participants
of the experiment. Our two groups are composed of students from the same university course, divided
according to their DAT score. The aim is indeed to minimize differences between the two test groups.
However, as an additional check, we assess how the performance of the produced controllers correlates
with the DAT score. Specifically, for each student, we take the average performance per arena over 100
runs and plot it in a scatter plot against the DAT score of the author (see Figure 5). This shows two
interesting aspects. First, the performance seems to increase with the DAT score. Second,
both groups include a few students who failed to produce satisfactory solutions, so that the resulting biases cancel each other out.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>This work shows the effect of LLMs in programming controllers for robots. Specifically, it highlights
how using Copilot leads to less robust and more specialized code, which performs worse when some
conditions change. This result is significant, as a well-known issue in robotics is maintaining the
performance while porting the code to different environments or platforms. Specifically, when this happens
from simulation to the real world, it usually takes the name of reality gap. One of the major challenges has
often been creating control systems robust to conditions different from those of development or training.
To this end, researchers have proposed many strategies and architectures [29, 33, 34, 35, 36, 37, 38, 39, 40].
Nevertheless, the poor results obtained while using LLMs seem to indicate that LLMs are not inherently
able to exploit them. This could be due to the limited availability of state-of-the-art examples
online. Indeed, most of the code available online comes from simulator examples, while the use of these
architectures and strategies is often limited to minimal illustrations in papers.</p>
      <p>Another difference detected is that using Copilot seems to produce more similar code. This could
again be due to the problem presented in the previous paragraph, i.e., that the LLM does not have a
broad enough knowledge of robotic systems. Alternatively, it could be related to the generally reduced
variability in LLM responses when compared to humans [41]. On its own, this is not necessarily a
problem. However, the limited variability in answers can lead to fewer options being presented to developers
aiming to produce effective controllers.</p>
      <p>Despite these limitations, we also highlight that the use of Copilot allowed students to obtain working controllers
faster than humans alone (see Figure 9). Nevertheless, this was not true in all cases. Some students
reported that asking Copilot directly for a solution led to a long process of fixes that instead lengthened
the development time. This led to a discrepancy in the students’ feedback on the quality of coding with
an LLM, with some students enjoying the process and others not (see Figure 8). At the end of
the experiment, students reported that the best use they found for Copilot was to ask for pseudocode
or to brainstorm, but not for the code generation itself (see Table 1).</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The effect of Large Language Models as coding assistants to program entire robot controllers has been
overlooked until now. In this work we shed light on the impact they have on the technical and
personal perspectives of a group of developers. Specifically, we find that LLMs affect the performance of
the produced controllers, making them less robust to environmental changes and thus to the reality gap.
Additionally, they tend to constrain the creativity of the developers, leading to the employment of very
similar strategies. Finally, although the use of LLMs sped up the development, the participants of the experiment
reported frustration and overall less enjoyment during coding. We believe these results to be important
for the creation of healthy and effective workplaces in a changing robot industry. Additionally, we
note that technological advancements of LLMs are still needed for them to be effective in tackling
cutting-edge research problems and development in robotics.</p>
      <p>One limitation of the current work is the small number of human participants. We plan to perform
additional experiments with a larger group of developers, so as to obtain more statistically robust results.
Future work could also consider multiple tasks and a comparison with a third group of developers
performing pair programming.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The authors thank the students of the Intelligent Robotic Systems course of the University of Bologna
who participated in this experiment. Specifically, we thank Emanuele Artegiani, Samuele De Tuglie,
Marco Fontana, Lorenzo Guerrini, Pablo Sebastian Vargas Grateron, and all the other participants who
preferred to remain anonymous.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on generative AI</title>
      <p>During the preparation of this work, the author(s) used Grammarly in order to: grammar and spelling
check, paraphrase, and reword. After using this tool, the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
      <p>… multi-engine simulator for multi-robot systems, Swarm Intelligence 6 (2012) 271–295.</p>
      <p>[25] R. Ierusalimschy, L. de Figueiredo, W. Filho, Lua–an extensible extension language, Software: Practice and Experience 26 (1996) 635–652.</p>
      <p>[26] Microsoft, Copilot, https://copilot.microsoft.com, 2023. Accessed 09 May 2025.</p>
      <p>[27] J. Olson, M. Webb, S. Cropper, E. Langer, J. Nahas, D. Chmoulevitch, The divergent association task measures verbal creativity in under 4 minutes, https://www.datcreativity.com, 2019. Accessed 09 May 2025.</p>
      <p>[28] J. Olson, J. Nahas, D. Chmoulevitch, S. Cropper, M. Webb, Naming unrelated words predicts creativity, Proceedings of the National Academy of Sciences: Psychological and Cognitive Sciences 118 (2021) e2022340118.</p>
      <p>[29] N. Jakobi, P. Husbands, I. Harvey, Noise and the reality gap: The use of simulation in evolutionary robotics, in: Advances in Artificial Life, Springer Berlin Heidelberg, 1995, pp. 704–720.</p>
      <p>[30] A. Ligot, M. Birattari, On mimicking the effects of the reality gap with simulation-only experiments, in: Swarm Intelligence, volume 11172, Springer International Publishing, 2018, pp. 109–122.</p>
      <p>[31] A. Ligot, M. Birattari, Simulation-only experiments to mimic the effects of the reality gap in the automatic design of robot swarms, Swarm Intelligence 14 (2020) 1–24.</p>
      <p>[32] R. Maertens, C. Van Petegem, N. Strijbol, T. Baeyens, A. Jacobs, P. Dawyndt, B. Mesuere, Dolos: Language-agnostic plagiarism detection in source code, Journal of Computer Assisted Learning 38 (2022) 1046–1061.</p>
      <p>[33] S. Koos, J. Mouret, S. Doncieux, The transferability approach: Crossing the reality gap in evolutionary robotics, IEEE Transactions on Evolutionary Computation 17 (2013) 122–145.</p>
      <p>[34] G. Francesca, M. Brambilla, A. Brutschy, V. Trianni, M. Birattari, AutoMoDe: A novel approach to the automatic design of control software for robot swarms, Swarm Intelligence 8 (2014) 89–112.</p>
      <p>[35] X. Peng, M. Andrychowicz, W. Zaremba, P. Abbeel, Sim-to-real transfer of robotic control with dynamics randomization, in: 2018 IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, 2018, pp. 3803–3810.</p>
      <p>[36] E. Salvato, G. Fenu, E. Medvet, F. Pellegrino, Crossing the reality gap: A survey on sim-to-real transferability of robot controllers in reinforcement learning, IEEE Access 9 (2021) 153171–153187.</p>
      <p>[37] P. Baldini, M. Braccini, A. Roli, Online adaptation of robots controlled by nanowire networks: A preliminary study, in: Artificial Life and Evolutionary Computation: 16th Italian Workshop, WIVACE 2022, Gaeta, Italy, September 14–16, 2022, Revised Selected Papers, volume 1780, Springer Nature Switzerland, 2023, pp. 171–182.</p>
      <p>[38] P. Baldini, A. Roli, M. Braccini, On the performance of online adaptation of robots controlled by nanowire networks, IEEE Access 11 (2023) 144408–144420.</p>
      <p>[39] M. Braccini, P. Baldini, A. Roli, An investigation of graceful degradation in boolean network robots subject to online adaptation, in: Artificial Life and Evolutionary Computation: 17th Italian Workshop, WIVACE 2023, Venice, Italy, September 6–8, 2023, Revised Selected Papers, volume 1977, Springer Nature Switzerland, 2024, pp. 202–213.</p>
      <p>[40] P. Baldini, M. Braccini, A. Roli, Fault recovery through online adaptation of boolean network robots, Sensors 25 (2025) 5849.</p>
      <p>[41] M. Braccini, G. Aguzzi, P. Baldini, Unraveling creativity through variability: A comparison of LLMs and humans in an educational Q&amp;A scenario, Submitted (2025).</p>
      <p>[Figure: vertical axis "Distance".]</p>
      <p>[Figure 6: DAT distribution. Vertical axis: DAT score.]</p>
      <p>[Figure 9: Time to working result. Horizontal axis: Group (with LLM, without LLM).]</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th/><th>Comment</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td>Students complain that code produced by Copilot needs many adjustments.</td></tr>
            <tr><td>2</td><td>Students complain that asking Copilot for different solutions and/or fixes always produces similar outputs.</td></tr>
            <tr><td>3</td><td>Students complain that asking Copilot for different solutions and/or fixes often breaks (or does not work with) previously generated code, thus requiring manual intervention.</td></tr>
            <tr><td>4</td><td>The Quick-Answer mode of Copilot appears quite ineffective in any case, thus resulting useless. The Think-Deeper mode of Copilot appears quite effective in all the cases.</td></tr>
            <tr><td>5</td><td>Students using Copilot experienced overall less enjoyment during coding.</td></tr>
            <tr><td>6</td><td>Students not using Copilot complain that more effort was needed to produce a working solution, and therefore they could hardly explore different strategies.</td></tr>
            <tr><td>7</td><td>Students state that the code produced by Copilot was often obscure and hard to understand, limiting possible improvements.</td></tr>
            <tr><td>8</td><td>Students state that Copilot was more useful during the beginning of the development. They state it is mostly useful to analyze the task characteristics, giving examples, brainstorming, producing pseudocode, producing sub-tasks, in order to start tackling the problem effectively.</td></tr>
            <tr><td>9</td><td>Some students state that Copilot is useful to improve code produced autonomously; others state that its suggestions are quite ineffective.</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ziftci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sjövall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Codecasa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kim</surname>
          </string-name>
          , Migrating code at scale with LLMs at Google,
          <year>2025</year>
          . arXiv:<pub-id pub-id-type="arxiv">2504.09691</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Doshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hauser</surname>
          </string-name>
          ,
          <article-title>Generative AI enhances individual creativity but reduces the collective diversity of novel content</article-title>
          ,
          <source>Science Advances</source>
          <volume>10</volume>
          (
          <year>2024</year>
          )
          <fpage>eadn5290</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kreminski</surname>
          </string-name>
          ,
          <article-title>Homogenization effects of Large Language Models on human creative ideation</article-title>
          ,
          <source>in: C&amp;C '24: Proceedings of the 16th Conference on Creativity &amp; Cognition, Association for Computing Machinery</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>425</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>U.</given-names>
            <surname>Arora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Oberoi</surname>
          </string-name>
          , Prachi,
          <string-name>
            <given-names>A.</given-names>
            <surname>Raina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tyagi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Analyzing LLM usage in an advanced computing class in India</article-title>
          ,
          <source>in: ACE '25: Proceedings of the 27th Australasian Computing Education Conference, Association for Computing Machinery</source>
          ,
          <year>2025</year>
          , pp.
          <fpage>154</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Nam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Macvean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hellendoorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vasilescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Myers</surname>
          </string-name>
          ,
          <article-title>Using an LLM to help with code understanding</article-title>
          ,
          <source>in: ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering</source>
          , Association for Computing Machinery,
          <year>2024</year>
          , pp.
          <volume>97</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Licorish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bajpai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Arora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tantithamthavorn</surname>
          </string-name>
          ,
          <article-title>Comparing human and LLM generated code: The jury is still out!</article-title>
          ,
          <year>2025</year>
          . arXiv:2501.16857.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hamza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Siemon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Akbar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <article-title>Human AI collaboration in software engineering: Lessons learned from a hands-on workshop</article-title>
          ,
          <year>2023</year>
          . arXiv:2312.10620.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Research: Quantifying GitHub Copilot's impact in the enterprise with Accenture</article-title>
          , https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-in-the-enterprise-with-accenture,
          <year>2024</year>
          . Accessed on May 2025.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bakal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dasdan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Katz</surname>
          </string-name>
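          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaufman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Levin</surname>
          </string-name>
          ,
          <article-title>Experience with GitHub Copilot for developer productivity at Zoominfo</article-title>
          ,
          <year>2025</year>
          . arXiv:2501.13282v1.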
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Large Language Models for code writing: Security assessment</article-title>
          , https://medium.com/@researchgraph/large-language-models-for-code-writing-security-assessment-f305f9f01ce9,
          <year>2024</year>
          . Accessed on May 2025.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Khati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Palacio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Mapping the trust terrain: LLMs in software engineering - insights and perspectives</article-title>
          ,
          <year>2025</year>
          . arXiv:2503.13793v1.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>Wiemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Terei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Raatz</surname>
          </string-name>
          ,
          <article-title>Large Language Model for assisted robot programming in microassembly</article-title>
          ,
          <source>in: Procedia CIRP: 57th CIRP Conference on Manufacturing Systems 2024 (CMS 2024)</source>
          , volume
          <volume>130</volume>
          , Elsevier,
          <year>2024</year>
          , pp.
          <fpage>244</fpage>
          -
          <lpage>249</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Antwi-Afari</surname>
          </string-name>
          ,
          <article-title>Large language model-based code generation for the control of construction assembly robots: A hierarchical generation approach</article-title>
          ,
          <source>Developments in the Built Environment</source>
          <volume>19</volume>
          (
          <year>2024</year>
          )
          <fpage>100488</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lucchetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schlesinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Freeman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Biswas</surname>
          </string-name>
          ,
          <article-title>Deploying and evaluating LLMs to program service mobile robots</article-title>
          ,
          <source>IEEE Robotics and Automation Letters</source>
          <volume>9</volume>
          (
          <year>2024</year>
          )
          <fpage>2853</fpage>
          -
          <lpage>2860</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vemprala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bonatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kapoor</surname>
          </string-name>
          ,
          <article-title>ChatGPT for robotics: Design principles and model abilities</article-title>
          ,
          <source>IEEE Access</source>
          <volume>12</volume>
          (
          <year>2024</year>
          )
          <fpage>55682</fpage>
          -
          <lpage>55696</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Do</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Long-context LLMs struggle with long in-context learning</article-title>
          ,
          <year>2024</year>
          . arXiv:2404.02060.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>U.</given-names>
            <surname>Antero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Blanco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Oñativia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sallé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sierra</surname>
          </string-name>
          ,
          <article-title>Harnessing the power of Large Language Models for automated code generation and verification</article-title>
          ,
          <source>Robotics</source>
          <volume>13</volume>
          (
          <year>2024</year>
          )
          <fpage>137</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Brooks</surname>
          </string-name>
          ,
          <article-title>Artificial life and real robots</article-title>
          ,
          <source>in: Toward a Practice of Autonomous Systems: Proceedings of the First European Conference on Artificial Life</source>
          , The MIT Press,
          <year>1992</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Birattari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ligot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Francesca</surname>
          </string-name>
          ,
          <article-title>AutoMoDe: A Modular Approach to the Automatic Off-Line Design and Fine-Tuning of Control Software for Robot Swarms</article-title>
          , Springer International Publishing,
          <year>2021</year>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>90</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hausman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ichter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Florence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <article-title>Code as policies: Language model programs for embodied control</article-title>
          ,
          <source>in: 2023 IEEE International Conference on Robotics and Automation (ICRA)</source>
          ,
          <source>Institute of Electrical and Electronics Engineers</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>9493</fpage>
          -
          <lpage>9500</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>C.</given-names>
            <surname>Schlesinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Biswas</surname>
          </string-name>
          ,
          <article-title>Creating and repairing robot programs in open-world domains</article-title>
          ,
          <year>2024</year>
          . arXiv:2410.18893.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bonani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Longchamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Magnenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rétornaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Burnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Roulet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vaussard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bleuler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mondada</surname>
          </string-name>
          ,
          <article-title>The marXbot, a miniature mobile robot opening new perspectives for the collective-robotic research</article-title>
          ,
          <source>in: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</source>
          ,
          <source>Institute of Electrical and Electronics Engineers</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>4187</fpage>
          -
          <lpage>4193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bonani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rétornaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Magnenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bleuler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mondada</surname>
          </string-name>
          ,
          <source>Physical Interactions in Swarm Robotics: The Hand-Bot Case Study</source>
          , Springer Berlin Heidelberg,
          <year>2013</year>
          , pp.
          <fpage>585</fpage>
          -
          <lpage>595</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>C.</given-names>
            <surname>Pinciroli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Trianni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>O'Grady</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Brutschy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brambilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mathews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ferrante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Di Caro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ducatelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Birattari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gambardella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorigo</surname>
          </string-name>
          ,
          <article-title>ARGoS: a modular, parallel,</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>