<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring LLM-based Data Augmentation Techniques for Code Comment Quality Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Priyam Dalmia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>American Express</institution>
          ,
          <addr-line>Gurugram, Haryana - 122003</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In software engineering, code comments vary widely in value, which calls for precise methods that can assess their quality. In this study, we aim to improve the classification of code comment usefulness by combining a manually annotated dataset with synthetic data augmentation. For the latter, we use GPT-3.5-turbo, a contemporary large language model, to label additional comment instances. We use logistic regression to create a baseline model for the comment usefulness classification task. We observe that, irrespective of the inclusion of synthetic data, the classification efficacy remains consistent, with an F1 score of approximately 0.80 both before and after adding the synthetic data. This study clarifies the implications and limits of synthetic data augmentation for evaluating code comment relevance.</p>
      </abstract>
      <kwd-group>
        <kwd>Large Language Models</kwd>
        <kwd>GPT-3.5</kwd>
        <kwd>Logistic Regression</kwd>
        <kwd>Comment Classification</kwd>
        <kwd>Data Augmentation</kwd>
        <kwd>Qualitative Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In today’s digital age, software has embedded itself as the cornerstone of many pivotal sectors,
ranging from finance and healthcare to transportation. As organizations tirelessly adapt to
evolving requirements, the existing software is continually modified and new software gets
written. Thus, the volume of source code increases constantly, leading to increased code
complexity to support new software functionality. Maintaining this large amount of source
code is a crucial phase of the Software Development Life Cycle (SDLC).</p>
      <p>
        Rapid cycles of development often force quick and dirty resolution of bugs, introduction of
new source code, or the updating of existing applications. Such accelerated timelines can result
in suboptimal coding practices. As software undergoes changes, associated documentation,
such as requirement specifications and high-level designs, may become outdated. In many cases,
prior developers’ insights or assistance are unavailable. This highlights the need for structured,
quality-focused development processes, with program comprehension serving as a key method
for maintaining existing source code [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Considering the evolving nature of software design, the primary dependable sources of
information become test execution traces, static analyses of programs, and code comments.
This research emphasizes the importance of code comments as indicators of program design
for both developers and automated systems. Code comments provide context on the reasoning
and objectives of the associated source code, facilitating better understanding and maintenance.
However, the quality of comments varies, necessitating automated tools to evaluate their
usefulness.</p>
      <p>A recurring challenge in studying code comment usefulness is the limited availability of
comprehensive, well-annotated datasets that cover the diverse nature of comments across
different programming contexts. To address this, there is a need for innovative methods to
augment existing data to enhance model performance on new, real-world comments. Our
approach in this study combines manual data labeling with synthetic data augmentation using
the GPT-3.5-turbo language model.</p>
      <p>In this paper, we focus on a binary classification task for source code comments in the C
language, categorizing them as ‘Useful’ or ‘Not Useful’. We begin with a dataset of over 11,000
manually labeled comments and use logistic regression for initial comment classification. This
dataset is then augmented with more than 200 GPT-labeled samples to test for performance
improvements. The model’s performance remained consistent, yielding an F1 score of
approximately 0.80 for both the original and augmented datasets.</p>
      <p>Through this study’s combination of manual annotation and synthetic data augmentation,
we aim to provide a contribution to the current understanding of code comment usefulness
classification. The goal is to address existing challenges and promote the development of
adaptable models for the ever-changing landscape of software development.</p>
      <p>The rest of the paper is organized as follows. Section 2 discusses the background work done in
the domain of comment classification. The task and dataset are described in Section 3. Our methodology
is discussed in Section 4. Results are presented in Section 5. Section 6 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Software metadata [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] plays a crucial role in the maintenance of code and its subsequent
understanding. Numerous tools have been developed to assist in extracting knowledge from
software metadata, which includes runtime traces and structural attributes of code [
        <xref ref-type="bibr" rid="ref10 ref11 ref3 ref4 ref5 ref6 ref7 ref8 ref9">3, 4, 5, 6, 7,
8, 9, 10, 11</xref>
        ].
      </p>
      <p>
        In the realm of mining code comments and assessing their quality, several authors have
conducted research. Steidl et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] employ techniques such as Levenshtein distance and
comment length to gauge the similarity of words in code-comment pairs, effectively filtering
out trivial and non-informative comments. Rahman et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] focus on distinguishing useful
from non-useful code review comments within review portals, drawing insights from attributes
identified in a survey conducted with Microsoft developers [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Majumdar et al. [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16, 17, 18</xref>
        ]
have introduced a framework for evaluating comments based on concepts crucial for code
comprehension. Their approach involves the development of textual and code correlation
features, utilizing a knowledge graph to semantically interpret the information within comments.
These approaches employ both semantic and structural features to address the prediction
problem of distinguishing useful from non-useful comments, ultimately contributing to the
process of decluttering codebases.
      </p>
      <p>
        In light of the emergence of large language models, such as GPT-3.5 or LLaMA [19], it becomes
crucial to assess the quality of code comments and compare them to human interpretation. The
IRSE track at FIRE 2023 [20] expands upon the approach presented in a prior work [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. It delves
into the exploration of various vector space models [21] and features for binary classification
and evaluation of comments, specifically in the context of their role in comprehending code.
Furthermore, this track conducts a comparative analysis of the prediction model’s performance
when GPT-generated labels for code and comment quality, extracted from open-source software,
are included.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Task and Dataset Description</title>
      <p>This section describes the task addressed in this paper. We aim to implement a binary
classification system that classifies source code comments as useful or not useful. The system
takes a code comment with its associated lines of code as input and outputs a label, useful or
not useful, indicating whether the comment helps developers comprehend the
associated code. Classical machine learning algorithms such as logistic regression can be used
to develop the classification system. The two classes of source code comments are described
as follows:
• Useful - The given comment is relevant to the corresponding source code.
• Not Useful - The given comment is not relevant to the corresponding source code.</p>
      <p>Our work uses a dataset of over 11,000 code-comment pairs written in C.
Each data instance consists of the comment text, a surrounding code snippet, and a label
that specifies whether the comment is useful or not. The whole dataset was collected from GitHub
and annotated by a team of 14 annotators. Sample data are shown in Table 1.</p>
      <p>A second, similar dataset is also created and used in this work. It is built
by collecting code-comment pairs from GitHub and having GPT assign the useful or not useful
label. This dataset has a similar structure to the original dataset and is later used to augment
it.</p>
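      <p>The record structure and the augmentation step described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' code: the <monospace>CommentSample</monospace> class, its field names, the <monospace>source</monospace> tag and the <monospace>augment</monospace> helper are assumptions introduced here, and the code snippets in the example rows are placeholders rather than dataset entries.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CommentSample:
    """One code-comment pair (field names are illustrative assumptions)."""
    comment: str  # comment text
    code: str     # surrounding code snippet
    label: str    # "Useful" or "Not Useful"
    source: str   # "human" for annotator labels, "gpt" for GPT-assigned labels


VALID_LABELS = {"Useful", "Not Useful"}


def augment(original: List[CommentSample],
            synthetic: List[CommentSample]) -> List[CommentSample]:
    """Return the original samples followed by the synthetic ones.

    Every label is validated so that a malformed GPT label is caught early.
    """
    combined = list(original) + list(synthetic)
    for sample in combined:
        if sample.label not in VALID_LABELS:
            raise ValueError(f"unexpected label: {sample.label!r}")
    return combined


# Hypothetical rows: the code snippets are placeholders, not dataset entries.
original = [
    CommentSample("/*cr to cr,nul*/", "/* placeholder C snippet */",
                  "Not Useful", "human"),
]
synthetic = [
    CommentSample("/* placeholder comment */", "/* placeholder C snippet */",
                  "Useful", "gpt"),
]
augmented = augment(original, synthetic)
```
]]></preformat>
      <p>Keeping a <monospace>source</monospace> field is a design choice of this sketch: it lets the human-labeled and GPT-labeled subsets be evaluated separately after they have been merged.</p>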
    </sec>
    <sec id="sec-4">
      <title>4. Working Principle</title>
      <p>We use logistic regression to implement the binary classification functionality. The system
takes comments together with their surrounding code snippets as input. We create embeddings of each
piece of code and its associated comment using a pre-trained Universal Sentence Encoder. The
output of the embedding process is used to train the machine learning model in both experiments. The training
dataset consists of 80% of the data instances along with their labels. The rest is used for testing, in
both experiments. The model is described in the following section.
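</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>#</th><th>Comment</th><th>Label</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td/><td>Not Useful</td></tr>
            <tr><td>2</td><td>/*cr to cr,nul*/</td><td>Not Useful</td></tr>
            <tr><td>3</td><td>/*convert minor status code (underlying routine error) to text*/</td><td>Useful</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>4.1. Logistic Regression</p>
      <p>We use logistic regression for the binary comment classification task, which uses a logistic
function to keep the regression output between 0 and 1. The logistic function is defined as
follows:
        <disp-formula id="eq-1"><label>(1)</label><tex-math>z = \mathbf{w}^{\top}\mathbf{x} + b</tex-math></disp-formula>
        <disp-formula id="eq-2"><label>(2)</label><tex-math>\sigma(z) = \frac{1}{1 + \exp(-z)}</tex-math></disp-formula>
      </p>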
      <p>The output of the linear regression equation (Equation 1) is passed to the logistic
function (Equation 2). The probability value generated by the logistic function is used for
binary class prediction based on the acceptance threshold. We set the threshold value to
0.6 in favor of the useful comment class. A three-dimensional input feature is extracted
from each training instance and passed to the regression function. The cross-entropy loss
function is used during training for hyper-parameter tuning.</p>
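      <p>The linear score, the logistic function, the 0.6 acceptance threshold and the 80/20 split can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: batch gradient descent, the learning rate, the number of epochs and all function names are choices made here; the threshold is read as "predict Useful when the probability is at least 0.6"; and the three features are random toy values standing in for the features derived from the Universal Sentence Encoder embeddings.</p>
      <preformat><![CDATA[
```python
import math
import random
from typing import List, Sequence, Tuple

THRESHOLD = 0.6  # acceptance threshold stated in the text


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear score w.x + b passed through the logistic function."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def predict(w: Sequence[float], b: float, x: Sequence[float],
            threshold: float = THRESHOLD) -> int:
    """Return 1 (Useful) when the probability reaches the threshold, else 0."""
    return 1 if predict_proba(w, b, x) >= threshold else 0


def cross_entropy(w, b, xs, ys) -> float:
    """Mean binary cross-entropy over a dataset."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict_proba(w, b, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, lr: float = 0.5, epochs: int = 500) -> Tuple[List[float], float]:
    """Batch gradient descent on the cross-entropy loss (assumed optimizer)."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = predict_proba(w, b, x) - y  # dLoss/dz for one sample
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy, well-separated data with three features (hypothetical values).
rng = random.Random(0)
xs = [[rng.gauss(1.0 if i % 2 else -1.0, 0.3) for _ in range(3)]
      for i in range(200)]
ys = [i % 2 for i in range(200)]

split = int(0.8 * len(xs))  # 80% for training, the rest for testing
w, b = train(xs[:split], ys[:split])
test_acc = sum(predict(w, b, x) == y
               for x, y in zip(xs[split:], ys[split:])) / (len(xs) - split)
print(f"train loss={cross_entropy(w, b, xs[:split], ys[:split]):.4f}, "
      f"test accuracy={test_acc:.2f}")
```
]]></preformat>
      <p>With a threshold above 0.5, a comment whose predicted probability is exactly 0.5 is assigned to the Not Useful class, so the model needs more evidence before labeling a comment as Useful.</p>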
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>We train our logistic regression model on both datasets. The original dataset has 11,452 samples
and the GPT-generated data has 233 samples. The first experiment uses only the original data;
the second uses the original dataset augmented with the GPT-generated data. The scores for
both experiments are listed below.</p>
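      <table-wrap>
        <table>
          <thead>
            <tr><th>Dataset</th><th>Accuracy (%)</th><th>Precision</th><th>Recall</th><th>F1 Score</th></tr>
          </thead>
          <tbody>
            <tr><td>Original Dataset</td><td>81.97293758</td><td>0.792349986</td><td>0.817765006</td><td>0.801166962</td></tr>
            <tr><td>Augmented Dataset</td><td>81.2152332</td><td>0.794243029</td><td>0.803115991</td><td>0.798028602</td></tr>
          </tbody>
        </table>
      </table-wrap>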
      <p>The very slight change in the scores across metrics suggests that the newly generated data
was practically indistinguishable from the original dataset, which supports the use of
GPT-generated data for data augmentation.</p>
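      <p>For reference, the four reported metrics can be computed from labels and predictions as sketched below, and the differences between the two experiments can be read directly from the reported scores. The sketch rests on assumptions: the paper does not state how precision, recall and F1 were averaged, so the function uses the standard definitions with Useful as the positive class; the function name and the toy labels are hypothetical. Only the values in <monospace>reported</monospace> are taken from this section.</p>
      <preformat><![CDATA[
```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 with 1 (Useful) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
toy_scores = binary_metrics(y_true, y_pred)

# Scores as reported in this section (accuracy in percent).
reported = {
    "original": {"accuracy": 81.97293758, "precision": 0.792349986,
                 "recall": 0.817765006, "f1": 0.801166962},
    "augmented": {"accuracy": 81.2152332, "precision": 0.794243029,
                  "recall": 0.803115991, "f1": 0.798028602},
}

# Change in each metric after augmentation (augmented minus original).
deltas = {metric: reported["augmented"][metric] - reported["original"][metric]
          for metric in reported["original"]}
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.6f}")
```
]]></preformat>
      <p>In the reported figures, precision rises slightly after augmentation, while accuracy, recall and F1 fall slightly; accuracy changes by less than one percentage point and F1 by less than 0.01.</p>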
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper has addressed a binary classification problem in the domain of source code comment
classification. Comments within source code written in C are classified by their
usefulness. We used logistic regression as our base
classification method. We conducted two experiments, one with the original dataset and another
with the original dataset plus the synthetic GPT-generated data. The similar results in both
cases indicate that the synthetic data is in line with the original dataset, and that synthetic
data creation can help to effectively increase the data volume required for training models. The
consistency of the synthetic data with the original dataset is supported by the results shown
above. Synthetic data generation can therefore serve as a useful form of data augmentation in many
pipelines.
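</p>
      <ref-list>
        <ref>
          <mixed-citation>[16] S. Majumdar, S. Papdeja, P. P. Das, S. K. Ghosh, <article-title>Comment-mine-a semantic search approach to program comprehension from code comments</article-title>, <source>in: Advanced Computing and Systems for Security</source>, Springer, <year>2020</year>, pp. <fpage>29</fpage>-<lpage>42</lpage>.</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] S. Majumdar, A. Bandyopadhyay, S. Chattopadhyay, P. P. Das, P. D. Clough, P. Majumder, <article-title>Overview of the IRSE track at FIRE 2022: Information retrieval in software engineering</article-title>, <source>in: Forum for Information Retrieval Evaluation</source>, ACM, <year>2022</year>.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] S. Majumdar, A. Bandyopadhyay, P. P. Das, P. Clough, S. Chattopadhyay, P. Majumder, <article-title>Can we predict useful comments in source codes? - Analysis of findings from Information Retrieval in Software Engineering track @ FIRE 2022</article-title>, <source>in: Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation</source>, <year>2022</year>, pp. <fpage>15</fpage>-<lpage>17</lpage>.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., <article-title>Language models are few-shot learners</article-title>, <source>Advances in Neural Information Processing Systems</source> <volume>33</volume> (<year>2020</year>) <fpage>1877</fpage>-<lpage>1901</lpage>.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] S. Majumdar, S. Paul, D. Paul, A. Bandyopadhyay, B. Dave, S. Chattopadhyay, P. P. Das, P. D. Clough, P. Majumder, <article-title>Generative AI for software metadata: Overview of the information retrieval in software engineering track at FIRE 2023</article-title>, <source>in: Forum for Information Retrieval Evaluation</source>, ACM, <year>2023</year>.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] S. Majumdar, A. Varshney, P. P. Das, P. D. Clough, S. Chattopadhyay, <article-title>An effective low-dimensional software code representation using BERT and ELMo</article-title>, <source>in: 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)</source>, IEEE, <year>2022</year>, pp. <fpage>763</fpage>-<lpage>774</lpage>.</mixed-citation>
        </ref>
      </ref-list>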
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Berón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Henriques</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Varanda</surname>
          </string-name>
          <string-name>
            <surname>Pereira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Uzal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Montejano</surname>
          </string-name>
          ,
          <article-title>A language processing tool for program comprehension</article-title>
          , in: XII Congreso
          <string-name>
            <surname>Argentino de Ciencias de la Computación</surname>
          </string-name>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>C. B. de Souza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Anquetil</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. M. de Oliveira</surname>
          </string-name>
          ,
          <article-title>A study of the documentation essential to software maintenance</article-title>
          ,
          <source>Conference on Design of communication, ACM</source>
          ,
          <year>2005</year>
          , pp.
          <fpage>68</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Hotcomments: how to make program comments more useful?, in: Conference on Programming language design and implementation (SIGPLAN)</article-title>
          , ACM,
          <year>2007</year>
          , pp.
          <fpage>20</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Papdeja</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
            ,
            <given-names>S. K.</given-names>
          </string-name>
          <string-name>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>Smartkt: a search framework to assist program comprehension using smart knowledge transfer</article-title>
          ,
          <source>in: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>108</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
          </string-name>
          ,
          <article-title>Debugging multi-threaded applications using pin-augmented gdb (pgdb)</article-title>
          ,
          <source>in: International conference on software engineering research and practice (SERP)</source>
          . Springer,
          <year>2015</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>115</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
          </string-name>
          ,
          <article-title>D-cube: tool for dynamic design discovery from multi-threaded applications using pin</article-title>
          ,
          <source>in: 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Chakrabarti</surname>
          </string-name>
          ,
          <article-title>A mathematical framework for design discovery from multi-threaded applications using neural sequence solvers</article-title>
          ,
          <source>Innovations in Systems and Software Engineering</source>
          <volume>17</volume>
          (
          <year>2021</year>
          )
          <fpage>289</fpage>
          -
          <lpage>307</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pratim Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chakrabarti</surname>
          </string-name>
          ,
          <article-title>Dcube_ nn d cube nn: Tool for dynamic design discovery from multi-threaded applications using neural sequence models</article-title>
          ,
          <source>Advanced Computing and Systems for Security:</source>
          Volume
          <volume>14</volume>
          (
          <year>2021</year>
          )
          <fpage>75</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Siegmund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Peitek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Parnin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Apel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hofmeister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kästner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Begel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bethmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Brechmann</surname>
          </string-name>
          ,
          <article-title>Measuring neural eficiency of program comprehension</article-title>
          ,
          <source>in: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>140</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Gotmare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Bui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Hoi</surname>
          </string-name>
          , Codet5+:
          <article-title>Open code large language models for code understanding and generation</article-title>
          ,
          <source>arXiv preprint arXiv:2305.07922</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Freitas</surname>
          </string-name>
          , D. da Cruz,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Henriques</surname>
          </string-name>
          ,
          <article-title>A comment analysis approach for program comprehension</article-title>
          ,
          <source>Annual Software Engineering Workshop</source>
          (SEW), IEEE,
          <year>2012</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Steidl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hummel</surname>
          </string-name>
          , E. Juergens,
          <article-title>Quality analysis of source code comments</article-title>
          ,
          <source>International Conference on Program Comprehension (ICPC)</source>
          , IEEE,
          <year>2013</year>
          , pp.
          <fpage>83</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>M. M. Rahman</surname>
            ,
            <given-names>C. K.</given-names>
          </string-name>
          <string-name>
            <surname>Roy</surname>
          </string-name>
          , R. G. Kula,
          <article-title>Predicting usefulness of code review comments using textual features and developer experience</article-title>
          ,
          <source>International Conference on Mining Software Repositories (MSR)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bosu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Greiler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bird</surname>
          </string-name>
          ,
          <article-title>Characteristics of useful code reviews: An empirical study at microsoft</article-title>
          ,
          <source>Working Conference on Mining Software Repositories, IEEE</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>146</fpage>
          -
          <lpage>156</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
            ,
            <given-names>P. D.</given-names>
          </string-name>
          <string-name>
            <surname>Clough</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Datta</surname>
            ,
            <given-names>S. K.</given-names>
          </string-name>
          <string-name>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>Automated evaluation of comments to aid software maintenance</article-title>
          ,
          <source>Journal of Software: Evolution and Process</source>
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <article-title>e2463</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Papdeja</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
            ,
            <given-names>S. K.</given-names>
          </string-name>
          <string-name>
            <surname>Ghosh</surname>
          </string-name>
          ,
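          <article-title>Comment-mine-a semantic search approach to program comprehension from code comments</article-title>
          ,
          <source>in: Advanced Computing and Systems for Security</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>42</lpage>
          .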
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>