<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Using Large Language Models to Support Software Engineering Documentation in Waterfall Life Cycles: Are We There Yet?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>AntonioDella Port</string-name>
          <email>adellaporta@unisa</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>VincenzoDe Martin</string-name>
          <email>demartino@unisa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>GilbertoRecupito</string-name>
          <email>recupito@unisa</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>CarmineIemmino</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Large Language Model, Artificial Intelligence for Software Engineering, ChatGPT,</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Gemma Catolin</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>SeSa Lab - Università Degli Studi di Salerno</institution>
          ,
          <addr-line>Via Giovanni Paolo II, 132, 84084 Fisciano, Salerno</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>1860</year>
      </pub-date>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>Software documentation is key for producing high-quality projects and ensuring their smooth evolution. Nonetheless, the activity of writing software artifacts is time-consuming and efort-prone. Looking at the existing body of knowledge, we outline limited evidence of how automated approaches may support practitioners when documenting the artifacts produced throughout the software lifecycle. In particular, there is still a lack of investigations into the capabilities of Large Language Models (LLMs), which are indeed supposed to be highly beneficial in this respect. In this paper, we propose a preliminary case study to understand how LLMs can support the development of the documentation of projects developed through a Waterfall lifecycle. Using ChatGPT, we engineered specific prompts to generate and validate the artifacts produced, taking an existing, documented software engineering project as an oracle. The main findings of the study show the ability of ChatGPT to produce most artifacts correctly. In addition, we find that software engineers would require a relatively low efort to adapt the outputs provided by ChatGPT to their own context, especially for textual artifacts.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>code analysis, generate code, and predict b3u]g.sTh[ese
Integrating Large Language Models (LLMs) into vwarair-e engineering tasks, especially considering software
ous domains has recently garnered significant attentdioevne.lopment and maintenance activit5i]e. sH[owever,
Recent statistics indicate that ChatGPT, a prominenottheexr- software engineering tasks, especially those related
ample of LLM, has gathered over 180 million users, utno-documentation, are still defined as key challen6]g.es [
derscoring the widespread adoption of such mo1d]e. lsS[ince there is a lack of studies in this specific field, we aim
LLMs showcase a remarkable versatility, particulartloy ipnrovide preliminary results to show the capabilities of
software engineering 2[], thus leading practitioners taon LLM to tackle the challenge of crafting software
docwonder how these models can efectively replicate thuemirentation. We selected a Waterfall Life Cycle project
tasks. From here, there is a need to explore their potentotiaexlplore LLMs’ documentation abilities across
develwithin the Software Development Lifecycle (SDLC). Inopment phases, from requirements to technical details.
particular, the literature showed how LLMs can simuTlhartoeugh this preliminary case study, we employed
Chatteam members in a development environment, perforGmPT 1 to generate documentation artifacts.</p>
      <sec id="sec-2-1">
        <title>We aim to evaluate ChatGPT’s real-world eficiency</title>
        <p>AI-powered systems can analyze large amounts of cobdyecomparing it to a benchmark project and gauging the
and data quickly and accurately, enabling automaetfoirotnto produce similarly high-quality artifacts.
Prelimiof repetitive tasks and allowing developers to focunsaroynfindings suggest ChatGPT eases documentation and
more complex issues 4[].</p>
      </sec>
      <sec id="sec-2-2">
        <title>These benefits allowed us to resolve key issues in soft</title>
        <p>outcomes.</p>
        <p>speeds up design replication but requires human input
for response refinement and query tuning. Initial
integration eforts are moderate, but some artifacts necessitated
revised prompts and external software for satisfactory
Artificial Intelligence for Software Engineering (AI4SE)
is a well-known research area that aims to develop AI
solutions and SE practices to improve software develo3p.- Research Method
ment processes and tool7s, [8]. With the emergence and
proliferation of LLMs, this field has encountered nTehwe goal of the study was to determine to what extent
opportunities to support and streamline the laborLsLMofs can support the activities of a software engineer
software engineers and researcher5s].[ when writing documentation in a software project
em</p>
        <p>In the vein of such advancements, De Vito et9]al.p[loying the Waterfall Life Cycle model, witphurtphoese
introduced ECHO, an innovative method utilizing LLoMfsproviding software engineers elements that can be
to aid software engineers in improving the quality loefveraged to support and improve the design process of
UML use cases. Further extending the utility of AsIoiftwnare projects. Theperspective is of both researchers
SE, De Vito et al.10[] a chatbot designed for softwareand practitioners. The former are interested in
underengineering, streamlining tasks like code review, teststinagn,ding the current potential and limitations of using
and criteria evaluation. LLMs for documentation tasks, possibly identifying
op</p>
        <p>Ahmad et al.1[1] explore the role of ChatGPT as a bopotrtunities for further research and improvement. The
in collaborative software architecting to support thelaantatle-r are interested in assessing how LLMs can act as
ysis, synthesis, and evaluation of microservices-badseodcumentation assistants in practice, verifying whether
software. A study by Liang et al1.2[] surveyed develop- these models may be employed in real-world contexts
ers’ perceptions, noting issues like code not meetingarned- potentially integrating them into their workflow.
quirements. Despite these advancements, the domain of
AI-assisted documentation in SE remains underexplor3e.d1,. Research Question
especially the comprehensive support for the entire
documentation lifecycle. Our research question aimed to understand whether</p>
        <p>As Robillard et a1l.3][ highlighted, traditional docLuM-Ms can substantially support the software
documenmentation practices are ineficient because of the matna-tion activities developed using a Waterfall Life model.
ual nature of its creation and the gap between creUantdoerrsstanding how documentation writing activities
usand consumers. Aghajani et a1l4.][reported that doci-ng LLM can improve artifacts and possibly reduce efort
umentation sufers numerous shortcomings and probw-ould be crucial. We chose ChatGPT because of its
populems, including insuficient and inadequate content alnadrity and availability, in line with similar st23u,d1i1e]s. [
outdated and ambiguous information. Recent investigIan-this context, we formulated the following research
tions have further explored the extent to which qLLuMesstion.
can assist in tasks like writing co15d]e, c[onducting code
reviews 1[6], providing code explanation17s][, and teach- RQ1. To what extent can ChatGPT support software
ing programming concepts18[]. These studies suggest engineering documentation tasks in a Waterfall Life Cycle
the potentiality of LLMs to create significant supportmiondel?
the activities involved in the SDLC and focus the humaTno address our research question, we conducted a
preefort on the quality and relevance of the results. liminary case study24[] using an oracle project and
com</p>
        <p>White et al1.9[] emphasized the importance of promptparing it to the output of the LLM to provide insights into
engineering to guide LLMs by presenting a catalogunofderstanding its usefulness for documentation tasks.
patterns to dialogue with LLMs to achieve satisfaWcteofroyllowed the guidelines by Wohlin et25a]la.n[d the
outputs. A well-written prompt enables correct ansAwCeMrs/SIGSOFT Empirical Standardfsor the repor2t.
by minimizing prompts [20, 21, 22]. Our work builds on
these studies, exploiting how to use prompts to supp3o.r2t. Context of the Study
documentation artifacts.</p>
        <p>Our research is motivated by the goal of comprehTeona-ddress the goal of our work and provide preliminary
sively understanding how ChatGPT can support bothisntsuig-hts into the capabilities of ChatGPT for
documentadents and practitioners during the software developmetniotn tasks, we selected a project naRmoejdina Review, a
lifecycle, focusing on creating improved documentatwieobn-based platform for news and reviews of video games.
of software systems. We aim to shed light on the roTlehis project has 100k lines of code and was initially
deof ChatGPT and LLMs in simplifying the developmenvetloped by a team of three software engineering students
process and assess the complexities involved in usiantgour university using a Waterfall lifecycle. On the one
ChatGPT to produce high-quality results. hand, we selected a fully developed proji.eec.,tw,ith the
full set of artifacts already developed toghroauvneda
truth against which to assess the capabilities of ChatGPT.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2Available ahtttps://github.com/acmsigsoft/EmpiricalStand.aWreds</title>
        <p>leveraged the guidelines availabl“eGfeonreral Standaradn”d“Case</p>
        <p>Study”.
output by the three first authors. The artifact produced
by ChatGPT was compared with the same artifact in
Rojina Review. The three first authors of the paper
had to agree to make an artifact acceptable. In case
of disagreement, a collaborative discussion was
facilitated to address and resolve assessment disparities.</p>
        <p>Afterward, the feedback was re-submitted to improve
the quality of the artifact. In this case, the discussion
about creating the artifact continued, and the feedback
from this phase was provided to ChatGPT until the
output was evaluated compliant for the evaluators or
the LLM could not respond better than the previous
phase.</p>
        <p>Object Design Defines the com- When the third step of the process was completed,
Document ponent design. the second step was repeated to create the next artifact.
Test Plan &amp; Test Describes how to Additionally, we noted that the language seemed more
Case Specification test the system. accurate when we asked ChatGPT to impersonate a
software engineer. For this reason, we used a generic prompt
that guided our research:
On the other hand, this project was closely supervised by
the paper’s authors. We wefraemiliar with the business Prompt of Requirement Tasks
case and the artifacts that should have been developed,
but alsoconfident of the quality of the project. We areYou have to impersonate a software
engiaware of potential threats to internal and external vnaeliedr- who has to produce the project
docuity related to this choice. However, we believe the projecmtentation of a software project. Consider
was good enough to ensure a satisfactory preliminary ast-he following problem statement to
genersessment. Following Bruegge and Dut2o6it], [we briefly ate the output:
explain the documents created for this project in1.Table&lt;problem statement content&gt;</p>
        <p>#Optional: Given that you have
&lt;addi3.3. Formulating the Waterfall Story tional info&gt; (e.g., the non-functional
requirements in the RAD)
Before starting our study, we gathered a working grouGpenerate &lt;name of the artifact&gt; for the
to determine a suitable prompt for ChatGPT. We adoptedscope of the software project that we
dea specific prompting process when interacting with Chat- fined
GPT for all artifacts to be created. This method allows t#hOeptional(only for UML artifacts) using
conduction of the activities to produce documentatiotnhe PlantUML syntax.
artifacts, simulating the phases of the Waterfall lifecycle
Model. In detail, the process includes three steps: We then started to generate the documentation in an
iterative and incremental process. The set of the
doc#1-Initial interact:ioWne set up the environmentumentation artifacts, according to the Waterfall Model,
in ChatGPT. Specifically, we adopted a single chat tthoe five main documents, and related tasks, are specified
interact and prevent the LLM from losing the proinjecTtable1.
context. Subsequently, we provided ChatGPT with an
initial prompt containing the preliminary information</p>
        <p>3.4. Data Extraction
of the project. We asked ChatGPT to provide
information concerning the problem statement. From the documentation of the project selected, we
extracted the document produced for each phase of the
#2-Artifact generat:ioton maintain the context of</p>
        <p>Waterfall Model, a set of the most important artifacts as
the output generated in the previous phase, we asked</p>
        <p>listed in Tabl1e.</p>
        <p>ChatGPT to provide the previous artifact at each Wdee- produced a prompt for each artifact that ChatGPT
velopment phase. could use to generate the artifact. For the generation
#3-Inter-rater assessm:efnotllowing the extractioonf the diagrams, we have usPeldantUML3. This
openof answers provided by ChatGPT, an inter-rater assseosus-rce tool allows users to create Unified Modeling
Lanment process was initiated to evaluate the gener3aSotuerdce code availablehatttps://github.com/plantuml/plantuml
Low Efort The desired answer is obtained with a
maximum of two prompts, does not need to be
too much articulated, and does not require
corrections, so it can easily used.</p>
        <p>Medium Ef- The desired answer is produced with
sevfort eral prompts ranging from three to five; the
response may require manual modification
where it is more complicated to have the bot
adjust the response.</p>
        <p>High Efort The desired answer is obtained with a
minimum of six very detailed prompts, and the
response requires manual corrections that
the bot cannot implement.
interaction. On the same line, the results fnoornt-he
3.5. Data Analysis functional requirements; by defining the functional ones,</p>
        <p>ChatGPT has been able to extract directly the related
To analyze the result obtained using ChatGPT, the nfirosnt-functional requirements with a single promptU.se
three authors of the paper, who have significant exCpaes-es need specific prompts for each system’s
functionrience in software engineering both from an academaiclity defined previously. Moreover, additional prompts
and enterprise perspective, had defined a set of critewrieare required to get the alternative flows. Focrlatsshe
to evaluate the efort needed by a software engineer whdoiagram, ChatGPT failed to produce a correct result with
has to be supported in creating the artifacts of thetdhoecrui-ght hierarchies, relationships, and cardinality. We
mentation. Those criteria, listed in T2a,bcloensider the observed the need to write the specific string “system
number of prompts needed and the level of adjustmecnlatss diagram” to obtain results, allowing ChatGPT to
of the prompt to reach an optimal result from ChatrGePpoTr.t associations among classes. For these reasons, the
The final acceptance of each artifact produced by ChLaLtM- fail to give a correct result.</p>
        <p>GPT was given by comparing it with the same artifact iOnn the one hand, in thsteatechart, a restricted number
Rojina Review to assess the quality. of prompts were needed to generate artifacts comparable
toRojina Review. On the other hand, the Sequence
4. Preliminary Results Diagrams needed more prompts with additional
specifications to achieve a good result.</p>
        <p>We submitted the prompts to ChatGPT for each selecteWd e needed a few prompts to generate tdehseign
artifact to address our research question and obtgaoianlse;dassigning and ordering using priority needed more
the results detailed in Ta3b.lWee started with the extrapcr-ompts. Thesubsystems division needed many prompts
tion ofscenarios. During the interaction, we noted thanatd corrections to get a result comparable with the
artiChatGPT finds dificulties in identifying key elementsfact oRfojina Review because initially, ChatGPT
proin the context. For instance, actors involved in aduspcee-d a semantically incorrect division, so we needed to
cific functionality are switched compared to the contprexotvide more details and required the PlantUML code.
of the system given in input. Therefore, we added Tahde-re were no issues fosroftware/hardware mapping ,
ditional prompts to address these issues. Subsequenbtoluyn,dary conditions, class interfaces, anddesign patterns:
we extractefdunctional requirements; ChatGPT produced ChatGPT has been able to generate a good result without
well-structured and formatted requirements after theefi rfosrtt.</p>
        <p>For the testing artifacts of the projecatt,etgohrey
partition required many prompts and was very specific</p>
      </sec>
      <sec id="sec-2-4">
        <title>4Source code availablehatttps://github.com/useocl/use</title>
        <p>for each functionality to test. Otherwitsees,t tcahsee preliminary findings suggest ChatGPT reduces time and
specifications was easier, as is using the category partiteiofonrt. Future work will involve a longitudinal study with
as input to build each test. professional feedback, exploring how prompt generation
expertise enhances real-world outputs.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Threats to Validity</title>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>Construct Validity. The main concern for construct
validity in our study concerns subject selection, paTrhtisc-work has been partially supported by the
Euroularly the version of the AI model. For evaluationp,ewane Union - NextGenerationEU through the Italian
Minused the GPT-3.5 model, the most advanced and avaisilt-ry of University and Research, Projects PRIN 2022
able version during the research. Even if the GP”TQ-u4alAI: Continuous Quality Improvement of AI-based
version has been released, the use is currently limSityestdems” (grant n. 2022B3BP5S , CUP: H53D23003510006)
by strict speed limits, and early feedback from the uasnedr PRIN 2022 PNRR ”FRINGE: context-aware FaiRness
community suggests potential stability and accureancgyineerING in complex software systEms” (grant n.
issues. P2022553SL, CUP: D53D23017340001). The opinions
presented in this article solely belong to the author(s) and
Internal Validity. To ensure robust internal valididtoy,not necessarily reflect those of the European Union
we carefully considered factors that could influeonrcTehe European Research Executive Agency. The
Eurothe outcomes derived from the LLM. Recognizing that</p>
      <p>pean Union and the granting authority cannot be held
LLMs’ responses are susceptible to prompt
formula</p>
      <p>accountable for these views.
tion, we conducted preliminary tests to identify the
most efective prompt structure1s9[, 22]. This step
was crucial to minimize variations in the model’sRree-ferences
sponses that could arise from prompt-related biases,
thereby ensuring that our findings more accurately r[1e]- DemandSage, Chatgpt statistics for 2024 (users
delfect the capabilities of the LLM rather than the nuances mographics and facts), 2024. URLh:ttps://www.
of our prompt phrasing. Additionally, each interaction demandsage.com/chatgpt-statist,iaccsc/essed:
Janwith the LLM was assessed iteratively by more authors uary 13, 2024.
through inter-rater assessment, allowing the reduc[t2i]onS. Wang, L. Huang, A. Gao, J. Ge, T. Zhang, H. Feng,
of the subjectivity of the results. We evaluated the acI-. Satyarth, M. Li, H. Zhang, V. Ng, Machine/deep
curacy of documents generated by ChatGPT using a learning for software engineering: A systematic
high-quality project from an undergraduate software literature review, IEEE Transactions on Software
engineering course as an oracle. This comparison was Engineering 49 (2023) 1188–1231. do1i:0.1109/TSE.
critical to verify that the observed results were indee2d022.3173346.
attributable to ChatGPT’s capabilities. [3] L. Belzner, T. Gabor, M. Wirsing, Large language
model assisted software engineering: prospects,
External Validity. The external validity threat exam- challenges, and a case study, in: International
Conines whether the results of a study can be generalizedference on Bridging the Gap between AI and Reality,
to other contexts. We experienced only one case study Springer, 2023, pp. 355–374.
of moderate complexity, which may limit the general[i4z]- Y. K. Dwivedi, N. Kshetri, L. Hughes, E. L. Slade,
ability of the study. Scenarios with greater developmentA. Jeyaraj, A. K. Kar, A. M. Baabdullah, A. Koohang,
complexity, diferent types of developmenet.g(., agile V. Raghavan, M. Ahuja, et al., “so what if chatgpt
instead of waterfall), and human writing prompt skillswrote it?” multidisciplinary perspectives on
oppormay afect the external validity of this research. Future tunities, challenges and implications of generative
work may involve validating the process with project conversational ai for research, practice and policy,
managers and a more significant number of software International Journal of Information Management
projects to minimize this external threat to validity. 71 (2023) 102642.</p>
      <p>[5] X. Hou, Y. Zhao, Y. Liu, Z. Yang, K. Wang, L. Li,
6. Conclusion and Future Work X. Luo, D. Lo, J. Grundy, H. Wang, Large language
models for software engineering: A systematic
litIn our study, to what extent ChatGPT can support soft- erature review, arXiv preprint arXiv:2308.10620
ware engineers in documenting waterfall projects. We (2023).
compared its use with a high-level university proje[c6t], I. Ozkaya, Application of large language models
focusing on response variability, design impact, and the to software engineering tasks: Opportunities, risks,
balance between AI support and human oversight. Our
and implications, IEEE Software 40 (2023) 4–8. [16] Q. Guo, J. Cao, X. Xie, S. Liu, X. Li, B. Chen,
doi:10.1109/MS.2023.3248401. X. Peng, Exploring the potential of chatgpt in
auto[7] M. Barenkamp, J. Rebstadt, O. Thomas, Applica- mated code refinement: An empirical study, arXiv
tions of ai in classical software engineering, AI preprint arXiv:2309.08221 (2023).</p>
      <p>Perspectives 2 (2020) 1. [17] J. Leinonen, P. Denny, S. MacNeil, S. Sarsa, S.
Bern[8] T. Xie, Intelligent software engineering: Synergy stein, J. Kim, A. Tran, A. Hellas, Comparing code
exbetween ai and software engineering, in: Proceed- planations created by students and large language
ings of the 11th Innovations in Software Engineer- models, arXiv preprint arXiv:2304.03938 (2023).
ing Conference, 2018, pp. 1–1. [18] A. Hellas, J. Leinonen, S. Sarsa, C. Koutcheme, L.
Ku[9] G. De Vito, F. Palomba, C. Gravino, S. Di Martino, janpää, J. Sorva, Exploring the responses of large
F. Ferrucci, Echo: An approach to enhance use case language models to beginner programmers’ help
quality exploiting large language models, in: 2023 requests, arXiv preprint arXiv:2306.05715 (2023).
49th Euromicro Conference on Software Engineer[-19] J. White, Q. Fu, S. Hays, M. Sandborn, C. Olea,
ing and Advanced Applications (SEAA), 2023, pp. H. Gilbert, A. Elnashar, J. Spencer-Smith, D. C.
53–60. doi:10.1109/SEAA60479.2023.00017. Schmidt, A prompt pattern catalog to enhance
[10] G. De Vito, S. Lambiase, F. Palomba, F. Ferrucci, prompt engineering with chatgpt, arXiv preprint
Meet c4se: Your new collaborator for software en- arXiv:2302.11382 (2023).
gineering tasks, in: 2023 49th Euromicro Confe[r20-] E. A. Van Dis, J. Bollen, W. Zuidema, R. van Rooij,
ence on Software Engineering and Advanced Ap- C. L. Bockting, Chatgpt: five priorities for research,
plications (SEAA), 2023, pp. 235–238. do1i0:.1109/ Nature 614 (2023) 224–226.</p>
      <p>SEAA60479.2023.00044. [21] S. Arora, A. Narayan, M. F. Chen, L. Orr, N. Guha,
[11] A. Ahmad, M. Waseem, P. Liang, M. Fahmideh, M. S. K. Bhatia, I. Chami, F. Sala, C. Ré, Ask me anything:
Aktar, T. Mikkonen, Towards human-bot collabo- A simple strategy for prompting language models,
rative software architecting with chatgpt, in: Pro- arXiv preprint arXiv:2210.02441 (2022).
ceedings of the 27th International Conferenc[e22o]n U. Lee, H. Jung, Y. Jeon, Y. Sohn, W. Hwang, J. Moon,
Evaluation and Assessment in Software Engineer- H. Kim, Few-shot is enough: exploring chatgpt
ing, 2023, pp. 279–285. prompt engineering method for automatic question
[12] J. T. Liang, C. Yang, B. A. Myers, Understanding generation in english education, Education and
the usability of ai programming assistants, arXiv Information Technologies (2023) 1–33.
preprint arXiv:2303.17125 (2023). [23] S. Jalil, S. Rafi, T. D. LaToza, K. Moran, W. Lam,
[13] M. P. Robillard, A. Marcus, C. Treude, G. Bavota, Chatgpt and software testing education: Promises
O. Chaparro, N. Ernst, M. A. Gerosa, M. God- and perils, in: 2023 IEEE International Conference
frey, M. Lanza, M. Linares-Vásquez, G. C. Murphy, on Software Testing, Verification and Validation
L. Moreno, D. Shepherd, E. Wong, On-demand de- Workshops (ICSTW), 2023, pp. 4130–4137. doi1: 0.
veloper documentation, in: 2017 IEEE International 1109/ICSTW58534.2023.00078.</p>
      <p>Conference on Software Maintenance and Evo[-24] R. K. Yin, Case study research and applications,
lution (ICSME), 2017, pp. 479–483. do1i:0.1109/ volume 6, Sage Thousand Oaks, CA, 2018.</p>
      <p>ICSME.2017.17. [25] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson,
[14] E. Aghajani, C. Nagy, M. Linares-Vásquez, B. Regnell, A. Wesslén, Experimentation in software
L. Moreno, G. Bavota, M. Lanza, D. C. Shepherd, engineering, Springer Science &amp; Business Media,
Software documentation: The practitioners’ 2012.
perspective, in: Proceedings of the ACM/IEE[E26] B. Bruegge, A. H. Dutoit, Object–oriented software
42nd International Conference on Software engineering. using uml, patterns, and java,
LearnEngineering, ICSE ’20, Association for Computing ing 5 (2009) 7.</p>
      <p>Machinery, New York, NY, USA, 2020, p. 590–601. [27] J. Cámara, J. Troya, L. Burgueño, A. Vallecillo, On
URL: https://doi.org/10.1145/3377811.338040.5 the assessment of generative ai in modeling tasks:
doi:10.1145/3377811.3380405. an experience report with chatgpt and uml., Softw
[15] P. Vaithilingam, T. Zhang, E. L. Glassman, Expecta- Syst Model 22 (2023) 781–793. dohi:ttps://doi.
tion vs. experience: Evaluating the usability of code org/10.1007/s10270-023-01105-5.
generation tools powered by large language models,
in: Extended Abstracts of the 2022 CHI Conference
on Human Factors in Computing Systems, CHI EA
’22, Association for Computing Machinery, New
York, NY, USA, 2022. URL:https://doi.org/10.1145/
3491101.3519665. doi:10.1145/3491101.3519665.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>