<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>From Abstract to Highlight: Automatic Research Highlight Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anindita Bhattacharya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tohida Rehman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Jadavpur University</institution>
          ,
          <addr-line>Kolkata</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>s. For this, we fine-tuned pre-trained T5-base and BART-base models for abstractive summarization, aiming to generate short, meaningful, and concise highlights that capture the essential ideas of each paper. The quality of the generated highlights is evaluated using standard metrics, including ROUGE-1, ROUGE-2, ROUGE-L, METEOR, BERTScore, and SciBERTScore. Our fine-tuned T5-base model achieves the best performance: the system proposed by our team, The NLP Explorers, attained a ROUGE-L F1 score of 22.94% and secured the 5th position in the SciHigh track of FIRE 2025. These results demonstrate that transformer-based models can serve as effective tools for enhancing scientific communication by automatically generating reliable and easy-to-read research paper highlights.</p>
      </abstract>
      <kwd-group>
        <kwd>Research Paper Highlights</kwd>
        <kwd>Pre-trained Language Models</kwd>
        <kwd>Fine-tuning</kwd>
        <kwd>Natural Language Generation</kwd>
        <kwd>Abstractive Summarization</kwd>
        <kwd>Evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Research paper highlights are short bullet points that give readers a quick idea of the main contributions
of a study. They make papers easier to understand and help readers quickly decide whether a paper is
relevant to their work. However, highlights are usually written by authors, which takes time and can
vary in quality.</p>
      <p>With the rapid growth of scientific publications, there is increasing interest in automating this task.
Recent progress in Natural Language Processing (NLP), especially transformer-based models, has shown
strong results in text summarization. Since abstracts already summarize the paper, they provide a good
starting point for automatically generating highlights.</p>
      <p>In this work, we fine-tuned the T5-base model and the BART-base model to generate research paper
highlights directly from abstracts. The models were trained to generate short, clear, and meaningful
research highlights that capture the main ideas. We evaluated the models’ performance using standard
metrics such as ROUGE-1, ROUGE-2, ROUGE-L, METEOR, BERTScore, and SciBERTScore, which
measure the similarity between generated and reference highlights.</p>
      <p>The main contributions of this study are as follows:</p>
      <p>1. Fine-tuning the T5-base model and BART-base model for the task of highlight generation from
abstracts.</p>
      <p>2. Evaluating the models’ performance using metrics like ROUGE, METEOR, BERTScore, and
SciBERTScore.</p>
      <p>3. Showing that transformer-based summarization can support faster and more consistent highlight
generation.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
      <p>
        Automatic text summarization has been studied for decades, beginning with some of the earliest
extractive approaches. Luhn [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] pioneered a method in 1958 that selected sentences based on
the frequency of significant words while discarding common terms. This early work established the
foundation for extractive summarization, where the task is to select existing sentences from a document
to represent its content. Over time, extractive methods evolved with more sophisticated heuristics and
statistical models to better identify important sentences.
      </p>
      <p>
        The field took a major turn with the introduction of neural sequence-to-sequence models, which
allowed abstractive summarization by generating new sentences rather than simply extracting them.
Techniques such as attention-based encoders and pointer-generator networks further improved the
quality of summaries by addressing long-range dependencies, handling out-of-vocabulary (OOV) words,
and reducing the problem of repetitive phrase generation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These innovations enabled models to
generate summaries that were not only concise but also fluent and semantically aligned with the input.
      </p>
      <p>
        The next breakthrough came with the transformer architecture [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which has since dominated
Natural Language Processing (NLP). Transformers made it possible to build very large pre-trained
models that could be fine-tuned for specific downstream tasks. Models such as T5 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], BART [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and
PEGASUS [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] demonstrated state-of-the-art performance across multiple summarization datasets. These
pre-trained language models captured rich linguistic patterns during large-scale pre-training, which
translated into strong results in domain-specific fine-tuning tasks. Rehman et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] developed a
GRU-based encoder-decoder with Bahdanau attention to generate concise English news summaries, achieving
improved performance for headline-style outputs. Rehman et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] further evaluated pre-trained models
such as Pegasus-CNN-DailyMail, T5-base, and BART-large-CNN across multiple datasets, including
CNN-DailyMail, SAMSum, and BillSum, to benchmark summarization performance. Bhattacharya et
al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] conducted a comparative analysis of abstractive summarization models for summarizing clinical
radiology reports using the MIMIC-CXR dataset. Their study evaluated models such as T5-base,
BART-base, PEGASUS-x-base, ChatGPT-4, LLaMA-3-8B, and a Pointer Generator Network with a coverage
mechanism, assessing their performance using ROUGE, METEOR, and BERTScore metrics to identify
the strengths and limitations of each in generating concise medical summaries.
      </p>
      <p>
        While text summarization has been widely studied, generating research paper highlights is a more
recent and specialized task. Unlike traditional summaries, highlights are typically short bullet points
that emphasize the most important contributions of a paper. Early attempts to address this include
Collins et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], who used supervised learning with a binary classifier to identify highlight-worthy
sentences, and Cagliero and Quatra [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], who employed regression methods to select the top-k most
relevant sentences.
      </p>
      <p>
        Rehman et al. made significant contributions in this area with a series of studies. They first proposed an
abstractive approach using pointer-generator networks [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] to generate highlights directly from research
abstracts. This method was important because it went beyond extraction and attempted to generate
highlights that were concise, coherent, and aligned with the abstract. To further improve this approach,
they incorporated Named Entity Recognition (NER) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] into the highlight generation pipeline, showing
that domain-specific knowledge could enhance the informativeness of generated highlights. Rehman
et al. [13] also explored different types of embeddings with a pointer-generator model, evaluating
highlight generation with various input combinations, such as the abstract only or a combination
of sections including the abstract, introduction, and conclusion, and employing ELMo embeddings. Later,
they carried out a comprehensive study comparing different deep learning models with SciBERT word
embeddings for highlight generation [14], offering one of the most detailed benchmarks in this emerging
field, and contributed the MixSub dataset. Together, these works form the basis for current research in
automated highlight generation.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>We used the MixSub dataset contributed by Rehman et al. [14] for the highlight generation task.
This dataset was built by collecting research articles from ScienceDirect, comprising 19,785 articles
published in 2020 across multiple domains. Each article is paired with its corresponding author-written
research highlights, making it well-suited for training and evaluating highlight generation systems.
Every entry in the dataset contains an abstract along with the highlights section. The dataset is divided
into training, validation, and test splits using an 80:10:10 ratio. For the SciHigh track at FIRE 2025,
10,000 samples from the training set, 1,985 samples from the validation set, and 1,840 samples from the
test set were supplied. Figure 1 presents an illustrative example from the MixSub dataset. The
author-written highlights are not direct extracts from the abstract; instead, they paraphrase
and condense its content, exemplifying abstractive summarization.</p>
      <sec id="sec-3-1">
        <title>Figure 1: An illustrative example from the MixSub dataset</title>
        <p>Abstract: “The current study introduces the flexible approach of mixture components to model the
spatiotemporal interaction for ranking of hazardous sites and compares the model performance with
the conventional methods. In case of predictive accuracy based on in sample errors the
Mixture 5 demonstrated superior performance in majority of the cases indicating the advantage
of mixture approach to accurately predict crash counts. LPML was also calculated as
a cross validation measure based on out of sample errors and this criterion also established
the dominance of Mixture 5 further reinforcing the superiority of the mixture approach from
different perspectives.”</p>
        <p>Author-written research highlights:</p>
        <p>▶ “A comprehensive evaluation was conducted for 9 spatiotemporal crash frequency models.”</p>
        <p>▶ “The model performance was evaluated based on both in sample and out of sample errors.”</p>
        <p>▶ “The site ranking performance of the proposed models was assessed using three criteria.”</p>
        <p>▶ “A flexible approach was proposed which accommodates the variations of time trend across space.”</p>
        <p>▶ “The research findings indicated the advantage of the proposed mixture approach to accurately
predict crash counts.”</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.1. T5-Base</title>
        <p>
          The Text-to-Text Transfer Transformer (T5) is built on the encoder–decoder framework and represents
a refined adaptation of the transformer architecture introduced by Vaswani et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. One of the key
contributions of T5 is its unified text-to-text paradigm, where a wide range of NLP tasks, including
machine translation, question answering, text classification, and summarization, are cast into a single
format: transforming an input text into an output text. This flexibility allows T5 to be applied across
diverse tasks without requiring significant changes in the model architecture.
        </p>
        <p>During pre-training, T5 is trained on a span-corruption objective, where random spans of text are
masked and the model is required to reconstruct the missing parts. This enables the model to learn
both syntactic and semantic dependencies across different contexts. The T5-base variant used in this
study contains 220 million parameters, making it computationally efficient compared to larger variants
while still offering strong performance in summarization tasks.</p>
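        <p>The span-corruption objective can be illustrated with a minimal Python sketch. It is not the T5 pre-training code: the function name span_corrupt, the example sentence, and the hand-picked span positions are assumptions made for illustration, whereas T5 samples the spans at random. The sentinel names follow the &lt;extra_id_N&gt; convention of T5 tokenizers.</p>
        <preformat><![CDATA[
```python
def span_corrupt(tokens, spans):
    """Replace each (start, length) span with a sentinel token.

    Returns (corrupted_input, target). `spans` must be sorted and
    non-overlapping. Span positions are chosen by hand here; real
    pre-training samples them randomly.
    """
    corrupted, target = [], []
    cursor = 0
    for i, (start, length) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        corrupted.extend(tokens[cursor:start])  # keep text before the span
        corrupted.append(sentinel)              # hide the span in the input
        target.append(sentinel)                 # announce the span in the target
        target.extend(tokens[start:start + length])
        cursor = start + length
    corrupted.extend(tokens[cursor:])
    target.append(f"<extra_id_{len(spans)}>")   # closing sentinel
    return corrupted, target


# Hypothetical example sentence, not taken from the dataset.
tokens = "highlights give readers a quick idea of the main contributions".split()
model_input, model_target = span_corrupt(tokens, [(1, 2), (6, 1)])
print(" ".join(model_input))
print(" ".join(model_target))
```
]]></preformat>
        <p>The model sees the corrupted sequence as input and is trained to emit the target sequence, that is, only the hidden spans, each introduced by its sentinel.</p>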
        <p>The model’s encoder is responsible for creating contextual embeddings of the input sequence, while
the decoder generates the corresponding output sequence in an autoregressive manner. This
encoder–decoder synergy makes T5 particularly effective for abstractive summarization tasks, such as
research highlight generation, where the goal is not just to extract key sentences but to generate
coherent, concise, and human-like summaries.</p>
        <sec id="sec-3-3-1">
          <title>4.2. BART-Base</title>
          <p>Bidirectional and Auto-Regressive Transformers (BART) is a sequence-to-sequence model introduced
by Lewis et al. [15], designed to combine the strengths of bidirectional encoder models like BERT
and autoregressive decoder models like GPT. Built upon the standard transformer encoder–decoder
architecture, BART is particularly well-suited for text generation tasks such as abstractive summarization,
paraphrasing, and dialogue modeling.</p>
          <p>BART is pre-trained using a denoising autoencoder objective, where the original text is corrupted
using various noise functions (including token masking, token deletion, sentence shuffling, and
document rotation) and the model is trained to reconstruct the clean text from the noisy input. This flexible
corruption strategy allows BART to learn robust representations of linguistic structure and semantics,
enabling strong downstream performance across diverse natural language processing tasks.</p>
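          <p>The four noise functions named above can be sketched in a few lines of Python. This is a simplified illustration, not BART's pre-training implementation: the function names, the corruption probability of 0.3, the &lt;mask&gt; symbol, and the whitespace tokenization are all assumptions made for the example.</p>
          <preformat><![CDATA[
```python
import random


def mask_tokens(tokens, rng, p=0.3, mask="<mask>"):
    """Token masking: replace each token with a mask symbol with probability p."""
    return [mask if rng.random() < p else tok for tok in tokens]


def delete_tokens(tokens, rng, p=0.3):
    """Token deletion: drop each token with probability p."""
    return [tok for tok in tokens if rng.random() >= p]


def shuffle_sentences(sentences, rng):
    """Sentence shuffling: return the sentences in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def rotate_document(tokens, rng):
    """Document rotation: restart the sequence at a randomly chosen token."""
    if not tokens:
        return []
    pivot = rng.randrange(len(tokens))
    return tokens[pivot:] + tokens[:pivot]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sentences = ["BART corrupts the text .", "The model restores it ."]
tokens = " ".join(sentences).split()

print(mask_tokens(tokens, rng))
print(delete_tokens(tokens, rng))
print(shuffle_sentences(sentences, rng))
print(rotate_document(tokens, rng))
```
]]></preformat>
          <p>In each case the training target is the original, uncorrupted text, so the model learns to undo the noise.</p>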
          <p>The BART-base variant used in this study contains 139 million parameters, making it lighter than
its larger counterparts while still retaining competitive generative capabilities. Its encoder captures
bidirectional contextual information from the input text, and its decoder generates output sequences
autoregressively, predicting one token at a time based on previously generated tokens. This architecture
makes BART highly effective for abstractive summarization tasks, where the model must integrate
information across sentences and generate fluent, coherent, and human-like summaries rather than
simply extracting content from the source.</p>
          <p>In applications such as research highlight generation, BART-base offers a favorable balance between
computational efficiency and summarization quality, allowing it to generate concise, contextually rich
summaries that maintain the essential meaning of the input document.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Performance Evaluation Metrics</title>
      <p>To assess the quality of the generated research highlights, we employed widely used automatic evaluation
metrics from the field of text summarization. These metrics are designed to compare the model-generated
summaries with human-written reference highlights, thereby providing an objective measure of accuracy
and fluency.</p>
      <p>We primarily focused on ROUGE, METEOR, BERTScore, and SciBERTScore, which are standard
metrics in summarization research.</p>
      <p>1. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) [16]:</p>
      <p>ROUGE is one of the most commonly used evaluation measures for summarization tasks. It
calculates the overlap between the generated summary and the reference text based on n-grams,
word sequences, and sentence-level structures. In our experiments, we used three key variants:</p>
      <p>a) ROUGE-1: Measures unigram (single word) overlap, which reflects the model’s ability to
capture the core words from the reference highlights.</p>
      <p>b) ROUGE-2: Considers bigram (two consecutive words) overlap, offering insight into the
fluency and coherence of the generated highlights.</p>
      <p>c) ROUGE-L: Based on the longest common subsequence, this metric evaluates the similarity
of sentence structures and captures how well the generated highlights preserve the ordering
and organization of information.</p>
      <p>2. METEOR (Metric for Evaluation of Translation with Explicit ORdering) [17]:</p>
      <p>METEOR complements ROUGE by focusing on semantic similarity and sentence-level alignment
between the generated and reference highlights. Unlike ROUGE, which relies heavily on
surface-level n-gram matches, METEOR accounts for synonyms, stemming, and word order variations.
This makes it more sensitive to the actual meaning conveyed in the generated output, ensuring
that highlights are not only word-accurate but also semantically faithful to the reference.</p>
      <p>3. BERTScore (Bidirectional Encoder Representations from Transformers Score) [18]:</p>
      <p>BERTScore evaluates the semantic similarity between the generated and reference highlights
using contextual embeddings from the BERT model. Instead of relying on exact word matches,
it measures token-level cosine similarity in the embedding space, allowing it to capture deeper
semantic relationships and paraphrased expressions. This makes BERTScore particularly effective
for abstractive summarization tasks, where meaning preservation is more important than
surface-level overlap.</p>
      <p>4. SciBERTScore (Scientific BERTScore):</p>
      <p>SciBERTScore is an adaptation of BERTScore that uses embeddings from the SciBERT model,
which is pre-trained on scientific and biomedical corpora. This domain-specific version enhances
the evaluation of summaries in scientific texts (such as radiology or biomedical reports) by better
understanding specialized terminology and contextual nuances. It provides a more accurate
measure of semantic fidelity for summaries in technical and research-oriented datasets.</p>
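      <p>To make the ROUGE variants concrete, the following Python sketch computes ROUGE-1, ROUGE-2, and ROUGE-L F1 scores for a single candidate and reference. It is a simplified illustration rather than the evaluation code used in our experiments: it assumes lowercase whitespace tokens and omits the stemming and multi-sentence handling that standard ROUGE packages provide. The function names and the example sentences are chosen for illustration only.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


# Illustrative sentences, lowercased and split on whitespace.
reference = "csgrp78 is a promising cancer biomarker".split()
candidate = "csgrp78 is a potential cancer biomarker".split()
print(round(rouge_n(candidate, reference, 1), 4))
print(round(rouge_n(candidate, reference, 2), 4))
print(round(rouge_l(candidate, reference), 4))
```
]]></preformat>
      <p>A single substituted word lowers the bigram score more than the unigram score, because it breaks two bigrams but only one unigram.</p>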
    </sec>
    <sec id="sec-5">
      <title>6. Experimental Setup</title>
      <p>In this section, we describe the data pre-processing steps and the implementation details used in our
experiments.</p>
      <p>For pre-processing, we first removed extra spaces from the text and kept only those samples that had
enough content: abstracts with at least 11 tokens and highlights with at least 14 tokens. To keep the
data consistent and easier to train on, we set a maximum length of 512 tokens for abstracts and 100
tokens for generated highlights.</p>
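      <p>The filtering step can be sketched as follows. This is a minimal illustration under stated assumptions: tokens are counted by splitting on whitespace (the tokenizer used for counting is not specified above), and the field names abstract and highlights are hypothetical. Truncation to the 512-token and 100-token limits is left to the model tokenizer and is not shown.</p>
      <preformat><![CDATA[
```python
import re

MIN_ABSTRACT_TOKENS = 11   # minimum abstract length stated in the text
MIN_HIGHLIGHT_TOKENS = 14  # minimum highlight length stated in the text


def normalize_spaces(text):
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def keep_sample(abstract, highlights):
    """Keep a pair only if both sides have enough whitespace tokens."""
    return (len(abstract.split()) >= MIN_ABSTRACT_TOKENS
            and len(highlights.split()) >= MIN_HIGHLIGHT_TOKENS)


def preprocess(samples):
    """Clean every sample and drop the ones that are too short."""
    cleaned = []
    for sample in samples:
        abstract = normalize_spaces(sample["abstract"])
        highlights = normalize_spaces(sample["highlights"])
        if keep_sample(abstract, highlights):
            cleaned.append({"abstract": abstract, "highlights": highlights})
    return cleaned


# Toy records (hypothetical data) to show the effect of the filter.
records = [
    {"abstract": "word " * 20, "highlights": "point " * 15},
    {"abstract": "too short", "highlights": "point " * 15},
]
print(len(preprocess(records)))
```
]]></preformat>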
      <p>For fine-tuning, we used the T5-base<sup>1</sup> and BART-base<sup>2</sup> pre-trained language models from Hugging
Face. We fine-tuned them on the provided SciHigh-MixSub dataset for 5 epochs, with a batch size of 8 and
a learning rate of 2e-5. These settings were chosen to achieve good performance while maintaining
stability during training.</p>
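      <p>As a rough guide to the training budget these settings imply, the sketch below collects the stated hyperparameters in a small configuration object and derives the number of optimizer steps. The class and method names are hypothetical, and the arithmetic assumes a single device, no gradient accumulation, and that all 10,000 supplied training samples pass the pre-processing filter.</p>
      <preformat><![CDATA[
```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters stated in the text; the names are illustrative."""
    epochs: int = 5
    batch_size: int = 8
    learning_rate: float = 2e-5
    max_input_tokens: int = 512
    max_target_tokens: int = 100

    def steps_per_epoch(self, num_samples):
        # One optimizer step per batch; the last batch may be smaller.
        return math.ceil(num_samples / self.batch_size)

    def total_steps(self, num_samples):
        return self.epochs * self.steps_per_epoch(num_samples)


config = FineTuneConfig()
print(config.steps_per_epoch(10_000))
print(config.total_steps(10_000))
```
]]></preformat>
      <p>Under these assumptions, 10,000 samples correspond to 1,250 steps per epoch and 6,250 steps over the 5 epochs.</p>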
    </sec>
    <sec id="sec-6">
      <title>7. Results</title>
      <p>In this section, we present the performance of the fine-tuned T5-base and BART-base models on the
highlight generation task. Table 1 shows the F1-scores (%) for ROUGE, METEOR, BERTScore, and
SciBERTScore obtained by both fine-tuned models on the test set. ROUGE and METEOR focus on
n-gram overlap and sentence-level alignment between the generated and reference highlights, while
BERTScore and SciBERTScore capture semantic similarity using contextual embeddings. The fine-tuned
T5-base model achieved the best results on all metrics except BERTScore and SciBERTScore, where
BART-base performed slightly better.</p>
      <p>Table 2 presents the ROUGE-L F1 score of all participating teams in the SciHigh track at FIRE 2025,
where our team, The NLP Explorers, achieved a ROUGE-L F1 score of 22.94% and secured the 5th position.</p>
      <sec id="sec-6-1">
        <title>7.1. Case Study</title>
        <p>To better understand the quality of the generated highlights, we present a case study in Figure 2. This
shows an example in which the author-written highlight is compared with the highlight generated by
the fine-tuned T5-base model and BART-base model. This comparison helps illustrate how closely the
generated text matches the style and content of the reference highlight.</p>
        <table-wrap id="tab-2">
          <label>Table 2</label>
          <caption>
            <p>ROUGE-L F1 score of all participating teams in the SciHigh track at FIRE 2025.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Group Name</th><th>Run</th><th>ROUGE-L F1 (%)</th></tr>
            </thead>
            <tbody>
              <tr><td>Text_highlights_gen</td><td>run1</td><td></td></tr>
              <tr><td>AiNauts</td><td>run1</td><td></td></tr>
              <tr><td>SVNIT_CSE</td><td>run1</td><td></td></tr>
              <tr><td>NLPFusion</td><td>run2</td><td></td></tr>
              <tr><td>The NLP Explorers</td><td>run2</td><td>22.94</td></tr>
              <tr><td>NIT_PATNA_2025</td><td>run1</td><td></td></tr>
              <tr><td>MUCS</td><td>run1</td><td></td></tr>
              <tr><td>JU_CSE_PR_KS</td><td>run1</td><td></td></tr>
              <tr><td>SCaLAR</td><td>run1</td><td></td></tr>
              <tr><td>Ayanika</td><td>run1</td><td></td></tr>
              <tr><td>Shilpo</td><td>run1</td><td></td></tr>
              <tr><td>TJP</td><td>run1</td><td></td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>From the case study, it can be seen that the fine-tuned T5-base model captures the main idea of the
author-written highlight while using slightly different wording. The generated highlights correctly
identify the role of csGRP78 as a molecular chaperone, its surface expression on cancer and angiogenic
endothelial cells, and its function as a promising biomarker and therapeutic target. These align closely
with the central ideas presented in the author-written highlights, showing that the T5-base model is
capable of extracting and paraphrasing domain-specific information with notable accuracy. However,
while the generated highlights successfully convey factual precision and topical relevance, they exhibit
a narrower scope by omitting certain advanced details such as the integration of targeting moieties
into nanoparticles, differential impacts of antibody epitopes, and recommendations for future clinical
translation. This suggests that the model tends to prioritize high-level biomedical concepts over nuanced
mechanistic or experimental insights.</p>
        <p>In comparison, the fine-tuned BART-base model identifies relevant biomedical concepts, but its first
two generated points do not directly align with the author-written highlights. Instead, they draw
on broader and more descriptive ideas from the abstract, such as the general challenges of targeting
heterogeneous cancer cells and the detailed molecular role of csGRP78 within the heat shock protein
family. Although these points are factually accurate, they represent background information rather
than the focused, actionable insights emphasized by the authors. The third generated point, however,
directly aligns with the author-written highlights by correctly identifying the overexpression of GRP78
on the surface of cancer and angiogenic endothelial cells. Moreover, while BART-base partially captures
the significance of csGRP78 surface expression, it does not fully articulate the biomarker or therapeutic
implications with the same level of specificity demonstrated by the T5-base model.</p>
        <p>Overall, the comparative analysis suggests that T5-base produces highlights that more closely align
with the author’s intended summary structure and biomedical emphasis, whereas BART-base tends
to generate broader, more explanatory content. Both models exhibit strong semantic understanding,
but T5-base demonstrates greater fidelity to the concise, insight-focused nature of scientific highlights,
while BART-base reflects a tendency toward verbose, context-heavy summarization.</p>
        <p><sup>1</sup> https://huggingface.co/google/t5-base</p>
        <p><sup>2</sup> https://huggingface.co/facebook/bart-base</p>
        <sec id="sec-6-1-10">
          <title>Figure 2: Case study comparing the author-written highlight with the highlights generated by the fine-tuned T5-base and BART-base models</title>
          <p>Abstract: “As one of the deadliest diseases cancer frequently resists existing therapeutics because they
do not target all cells within a progressing tumor for example both tumor stem and proliferating cells.
This frequently results in enrichment of invasive and metastatic drug resistant tumor cells subpopulations
cancer recurrence and eventually patient mortality. Thus there is an urgent need to identify specific
markers by which the targeted imaging and or therapeutic guided missile like agents can specifically
detect and or eradicate all cancer cells within a heterogeneous tumor while leaving the normal cells intact.
As a member of heat shock protein 70 superfamily glucose regulated protein 78 has been documented
as a molecular chaperone in the endoplasmic reticulum which mainly responds to ER stresses in normal
cells. There is over expression of GRP78 on the surface of cancer cells and angiogenic endothelial cells
which makes it a promising target for different types of peptides and antibodies that can be employed
for targeted cancer therapy or imaging. In this review we discuss the biological processes functional
importance and translocation mechanisms of cell surface GRP78 in tumor cells. As a cancer biomarker we
also review the potential applications of csGRP78 targeted therapy and imaging and finally we suggest a
brief roadmap ahead of csGRP78 targeting for targeted theranostic implications.”</p>
          <p>Author-written highlight:</p>
          <p>▶ csGRP78 is expressed principally by cancer cells and angiogenic endothelial cells.</p>
          <p>▶ csGRP78 is a promising cancer biomarker and target for imaging and therapy.</p>
          <p>▶ csGRP78 targeting moieties can be integrated into nanoparticles.</p>
          <p>▶ Anti csGRP78 antibodies targeting different epitopes differ in therapeutic impact.</p>
          <p>▶ New generation csGRP78 targeting agents should encourage clinical testing.</p>
          <p>Fine-tuned T5-base model generated highlight:</p>
          <p>▶ csGRP78 is a molecular chaperone in the endoplasmic reticulum.</p>
          <p>▶ GRP78 is expressed on the surface of cancer cells and angiogenic endothelial cells.</p>
          <p>▶ csGRP78 is a promising target for different types of peptides and antibodies.</p>
          <p>▶ csGRP78 is a potential cancer biomarker for targeted therapy and imaging.</p>
          <p>Fine-tuned BART-base model generated highlight:</p>
          <p>▶ Cancer is a heterogeneous disease and it is difficult to target all cancer cells within a tumor while
leaving the normal cells intact.</p>
          <p>▶ As a member of heat shock protein 70 superfamily glucose regulated protein 78 has been documented
as a molecular chaperone in the endoplasmic reticulum which mainly responds to ER stresses in
normal cells.</p>
          <p>▶ There is over expression of GRP78 on the surface of cancer cells and angiogenic endothelial cells</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>8. Conclusion and Future Scope</title>
      <p>In this paper, we explored the task of generating research paper highlights directly from abstracts using
the pre-trained T5-base and BART-base models. The models were fine-tuned on the SciHigh-MixSub
dataset provided by the SciHigh track at FIRE 2025 and evaluated using standard summarization metrics such as
ROUGE, METEOR, BERTScore, and SciBERTScore. The experimental results show that these models are
capable of generating highlights that are semantically relevant and stylistically close to author-written
highlights. Both the quantitative evaluation and the qualitative case study confirm the potential of
using transformer-based models for highlight generation, thereby reducing the manual effort required
by researchers and publishers.</p>
      <p>Although the results are promising, there are several directions for future research. First, more
advanced transformer models such as PEGASUS or LLaMA could be explored and compared with
T5-base and BART-base to assess improvements in highlight quality. Second, incorporating domain-specific
pre-training, especially for scientific articles in fields such as medicine or computer science, may improve
relevance and factual accuracy. Third, human evaluation could be included alongside automatic metrics
to better assess the usefulness of generated highlights for end users. Finally, developing lightweight
and energy-efficient approaches will be important to address the environmental concerns associated
with training large-scale models.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>Generative AI tools were employed solely to aid in language polishing and formatting for specific
sections of this manuscript. All scientific content, experimental design, data collection, analysis, and
interpretation were independently developed and verified by the author(s). The AI tools did not
participate in experiment planning, coding, data processing, or drawing conclusions.</p>
      <p>Document Processing (SDP 2022) collocated with COLING 2022, Association for Computational
Linguistics, Gyeongju, Republic of Korea, 2022, pp. 163–169.</p>
      <p>[13] T. Rehman, D. K. Sanyal, S. Chattopadhyay, Research highlight generation with ELMo contextual
embeddings, Scalable Computing: Practice and Experience 24 (2023) 181–190.</p>
      <p>[14] T. Rehman, D. K. Sanyal, S. Chattopadhyay, P. K. Bhowmick, P. P. Das, Generation of highlights
from research papers using pointer-generator networks and SciBERT embeddings, IEEE Access 11
(2023) 91358–91374. doi:10.1109/ACCESS.2023.3292300.</p>
      <p>[15] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer,
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation,
and comprehension, in: Proceedings of the 58th Annual Meeting of the Association for
Computational Linguistics, 2020, pp. 7871–7880.</p>
      <p>[16] C.-Y. Lin, ROUGE: A package for automatic evaluation of summaries, in: Text Summarization
Branches Out, Association for Computational Linguistics, Barcelona, Spain, 2004, pp. 74–81.</p>
      <p>[17] S. Banerjee, A. Lavie, METEOR: An automatic metric for MT evaluation with improved correlation
with human judgments, in: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation
Measures for Machine Translation and/or Summarization, 2005, pp. 65–72.</p>
      <p>[18] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, Y. Artzi, BERTScore: Evaluating text generation
with BERT, in: 8th International Conference on Learning Representations (ICLR 2020), 2020, pp.
1–43.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Luhn</surname>
          </string-name>
          ,
          <article-title>The automatic creation of literature abstracts</article-title>
          ,
          <source>IBM Journal of Research and Development</source>
          <volume>2</volume>
          (
          <year>1958</year>
          )
          <fpage>159</fpage>
          -
          <lpage>165</lpage>
          . doi:10.1147/rd.22.0159.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>See</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>Get to the point: Summarization with pointer-generator networks</article-title>
          ,
          <source>in: Proc. 55th ACL</source>
          , Vol.
          <volume>1</volume>
          : Long Papers
          ,
          <year>2017</year>
          , pp.
          <fpage>1073</fpage>
          -
          <lpage>1083</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
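          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>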
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Raffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Exploring the limits of transfer learning with a unified text-to-text transformer</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>21</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schluter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Online,
          <year>2020</year>
          , pp.
          <fpage>7871</fpage>
          -
          <lpage>7880</lpage>
          . doi:10.18653/v1/2020.acl-main.703.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <article-title>Abstractive text summarization using attentive GRU based encoder-decoder</article-title>
          ,
          <source>in: Applications of Artificial Intelligence and Machine Learning: Select Proceedings of ICAAAIML 2021</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>687</fpage>
          -
          <lpage>695</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <article-title>An analysis of abstractive text summarization using pre-trained models</article-title>
          ,
          <source>in: Proceedings of International Conference on Computational Intelligence, Data Science and Cloud Computing: IEM-ICDC 2021</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>253</fpage>
          -
          <lpage>264</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <article-title>Comparative analysis of abstractive summarization models for clinical radiology reports</article-title>
          ,
          <source>arXiv preprint arXiv:2506.16247</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E.</given-names>
            <surname>Collins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Augenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <article-title>A supervised approach to extractive summarisation of scientific papers</article-title>
          ,
          <source>in: Proc. 21st Conf. on Computational Natural Language Learning (CoNLL 2017)</source>
          , ACL, Vancouver, Canada,
          <year>2017</year>
          , pp.
          <fpage>195</fpage>
          -
          <lpage>205</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Quatra</surname>
          </string-name>
          ,
          <article-title>Extracting highlights of scientific articles: A supervised summarization approach</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>160</volume>
          (
          <year>2020</year>
          )
          <fpage>113659</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Bhowmick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <article-title>Automatic generation of research highlights from scientific abstracts</article-title>
          ,
          <source>in: 2nd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE 2021), collocated with JCDL 2021</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <article-title>Named entity recognition based automatic generation of research highlights</article-title>
          ,
          <source>in: Proceedings of the Third Workshop on Scholarly</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>