<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Information Retrieval in Software Engineering utilizing a pre-trained BERT model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Koyel Ghosh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Apurbalal Senapati</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Central Institute of Technology</institution>
          ,
          <addr-line>Kokrajhar, Assam, India</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The task is to detect whether a source code comment is useful or not: a comment and its surrounding code are paired together as input. The IRSE (Information Retrieval in Software Engineering) shared task, organized at FIRE 2022 (Forum for Information Retrieval Evaluation), poses a binary classification task in which a system classifies Comment and Surrounding Code Context pairs into two classes: (a) USEFUL or (b) NOT_USEFUL. For this task, we experimented with the roberta-base model and obtained a macro F1 score of 0.9047. Our submission placed second among all submissions.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Retrieval in Software Engineering</kwd>
        <kwd>Binary classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-3">
      <title>2. Related work</title>
      <p>The authors of [6] classify comments as useful, partially useful, and not useful; they report precision and
recall scores of 86.27% and 86.42%, respectively. As per [7], annotating programs with natural
language comments is a standard programming practice to increase the readability of code.
They manually annotate concepts for 5600 comments extracted from 672 C/C++ files/projects
crawled from code repositories like GitHub. Comment-Mine extracts 38,992 concepts, out of
which 79.8% are correct, validated using manual annotation.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Experimental Setup</title>
      <sec id="sec-4-1">
        <title>3.1. Dataset</title>
        <p>IRSE, a shared task organized by FIRE (Forum for Information Retrieval Evaluation), published
a training set containing 8047 Comment and Surrounding Code Context pairs, each labeled with a
Class, i.e. USEFUL or NOT_USEFUL. A total of 1001 Comment and Surrounding Code Context
pairs are given in the test set. Table 1 shows the detailed dataset statistics.</p>
        <p>Label encoding: here, we simply convert NOT_USEFUL to “0” and USEFUL to “1” in the
Class column.</p>
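        <p>The label encoding described above can be sketched in a few lines of Python (a minimal sketch; the helper name is ours, not from the shared task):</p>

```python
# Map the Class column labels to integers, as described above:
# NOT_USEFUL -> 0, USEFUL -> 1.
LABEL_MAP = {"NOT_USEFUL": 0, "USEFUL": 1}

def encode_labels(classes):
    """Encode a list of Class values into 0/1 integers."""
    return [LABEL_MAP[c] for c in classes]

print(encode_labels(["USEFUL", "NOT_USEFUL", "USEFUL"]))  # [1, 0, 1]
```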
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption>
            <p>Dataset statistics</p>
          </caption>
          <table>
            <thead>
              <tr><th>IRSE</th><th>NOT_USEFUL</th><th>USEFUL</th><th>Total</th></tr>
            </thead>
            <tbody>
              <tr><td>Training set</td><td>3710</td><td>4337</td><td>8047</td></tr>
              <tr><td>Test set</td><td>719</td><td>282</td><td>1001</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Pretrained BERT models</title>
        <p>BERT models are trained on a large raw-text corpus (without human labeling) in a self-supervised
way. Figure 1 shows the representation of the approach. We ran several experiments and found
the best hyperparameter combination, reported in Table 3. In the evaluation, P_NOT_USEFUL and
P_USEFUL denote the precision of the NOT_USEFUL and USEFUL classes, R_NOT_USEFUL the recall of
the NOT_USEFUL class, F1_NOT_USEFUL and F1_USEFUL the F1 scores of the NOT_USEFUL and USEFUL
classes, and N_NOT_USEFUL and N_USEFUL the total number of NOT_USEFUL and USEFUL class texts
present in the test set.</p>
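        <p>From these per-class quantities, the macro F1 reported in the abstract is the unweighted mean of the two per-class F1 scores. A minimal sketch (the precision/recall values below are illustrative, not the shared-task numbers):</p>

```python
def f1(precision, recall):
    """Per-class F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_f1(per_class_f1s):
    """Macro F1: unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1s) / len(per_class_f1s)

# Illustrative per-class precision/recall values only.
f1_not_useful = f1(0.90, 0.92)
f1_useful = f1(0.91, 0.89)
print(round(macro_f1([f1_not_useful, f1_useful]), 4))
```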
        <p>We run our code for up to 10 epochs and take the best result across all epochs. Here, we
observe overfitting while fine-tuning the pre-trained BERT models: after epoch 4, the validation
loss increases while the training loss keeps decreasing. We did not try adding a dropout layer here.</p>
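        <p>The overfitting pattern described above, with validation loss rising after epoch 4 while training loss keeps falling, is the usual signal for model selection by validation loss; a minimal sketch of that selection logic (the loss values below are made up for illustration):</p>

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

# Hypothetical validation losses over 10 epochs: decreasing until
# epoch 4, then increasing as the model starts to overfit.
val_losses = [0.52, 0.41, 0.35, 0.31, 0.33, 0.36, 0.40, 0.44, 0.49, 0.55]
print(best_epoch(val_losses))  # 4
```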
        <p>On the IRSE test set, the roberta-base model achieved a macro F1 score of 0.9047.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, our task is to classify a Comment and Surrounding Code Context pair as USEFUL
or NOT_USEFUL. We used a pre-trained BERT model. During the work, we observed that the
maximum length of a comment over the entire set is six, while for the Surrounding Code Context it is
821. As BERT's maximum input length is 512, one could experiment with Longformer
(<ext-link ext-link-type="uri" xlink:href="https://huggingface.co/docs/transformers/model_doc/longformer">https://huggingface.co/docs/transformers/model_doc/longformer</ext-link>),
but it needs a well-configured machine and may otherwise run into memory issues. Later, a dual BERT
(<ext-link ext-link-type="uri" xlink:href="https://towardsdatascience.com/siamese-and-dual-bert-for-multi-text-classification-c6552d435533">https://towardsdatascience.com/siamese-and-dual-bert-for-multi-text-classification-c6552d435533</ext-link>)
can be used in place of a single BERT.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1"><mixed-citation>[1] S. Majumdar, A. Bandyopadhyay, P. P. Das, P. D. Clough, S. Chattopadhyay, P. Majumder, Overview of the IRSE subtrack at FIRE 2022: Information Retrieval in Software Engineering, in: Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation, ACM, 2022.</mixed-citation></ref>
      <ref id="ref2"><mixed-citation>[2] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems 26 (2013).</mixed-citation></ref>
      <ref id="ref3"><mixed-citation>[3] A. Sherstinsky, Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network, CoRR abs/1808.03314 (2018). URL: http://arxiv.org/abs/1808.03314. arXiv:1808.03314.</mixed-citation></ref>
      <ref id="ref4"><mixed-citation>[4] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780.</mixed-citation></ref>
      <ref id="ref5"><mixed-citation>[5] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.</mixed-citation></ref>
      <ref id="ref6"><mixed-citation>[6] S. Majumdar, A. Bansal, P. Das, P. Clough, K. Datta, S. Ghosh, Automated evaluation of comments to aid software maintenance, Journal of Software: Evolution and Process 34 (2022). doi:10.1002/smr.2463.</mixed-citation></ref>
      <ref id="ref7"><mixed-citation>[7] S. Majumdar, S. Papdeja, P. Das, S. Ghosh, Comment-Mine—A Semantic Search Approach to Program Comprehension from Code Comments, 2020, pp. 29–42. doi:10.1007/978-981-15-2930-6_3.</mixed-citation></ref>
    </ref-list>
  </back>
</article>