<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Scientific Summarization: A Neural Approach to Research Highlight Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ayanika Samanta</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tohida Rehman</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Jadavpur University</institution>
          ,
          <addr-line>Kolkata</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Techno India University</institution>
          ,
          <addr-line>Kolkata</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
<p>With the rapid surge in scientific publications, researchers and indexing platforms increasingly require reliable tools that can condense complex studies into clear and accessible summaries. Research highlights play a particularly important role because they capture the core contributions of a paper in a more focused and digestible form than traditional abstracts. In this work, we address a shared task that builds on the previously developed MixSub dataset to automatically generate research highlights from the abstracts of scientific articles. Our objective is to improve the clarity, usefulness, and accuracy of machine-generated highlights so they can better assist academic search and retrieval systems. To explore this task, we fine-tuned transformer-based models, including T5, and evaluated their performance on the shared benchmark. In the SciHigh track at FIRE 2025, our team, Ayanika, secured the tenth position with a ROUGE-L F1 score of 17.91%.</p>
      </abstract>
      <kwd-group>
<kwd>Text Summarization</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Pre-trained Language Model</kwd>
        <kwd>Evaluation Metrics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Our main contributions are as follows:
        1. We fine-tuned pre-trained transformer models, particularly the T5 text-to-text architecture, to generate structured research highlights, and demonstrated their ability to condense lengthy scientific abstracts into concise, meaningful points.
        2. We evaluated model performance using widely accepted metrics such as ROUGE [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and METEOR [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], providing a comparative analysis of highlight-generation quality.
      </p>
      <p>
        To illustrate the task, consider the closing excerpt of an example abstract and its author-written research highlights: “... mitosis in which centrosome functions as an electronic generator. In particular the spinal rotations of centrioles transform the cellular chemical energy into cellular electromagnetic energy. The model is strongly supported by multiple experimental evidences. It offers an elegant explanation for the self organized orthogonal configuration of the two centrioles in a centrosome that is through the dynamic electromagnetic interactions of both centrioles of the centrosome.”
        Author-written research highlights:
        ▶ “We provide a model to describe centrosome function in correlation with its structural organizations.”
        ▶ “We suggested electromagnetic field is the missing link for centrosome function during mitosis.”
        ▶ “We offered physical explanations for the orthogonal self organization structural features of centrosome.”
        ▶ “We provided multiple detailed evidences to support the electromagnetic model we built for centrosome function.”
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The rapid expansion of digital information has made automatic text summarization an essential tool for
managing and understanding large volumes of content. Early work in summarization relied on statistical
and heuristic methods, which selected sentences based on cues such as term frequency, sentence
position, or structural markers. As research progressed, extractive systems evolved to incorporate
more sophisticated strategies. To choose the most central sentences, graph-based techniques such
as TextRank [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] used algorithms similar to PageRank, in which sentences serve as nodes, and edges
denote semantic similarity.
      </p>
      <p>
        With the advent of deep learning, summarization shifted from simple extraction toward more fluent
generation. Abstractive approaches, unlike extractive ones, aim to produce new sentences that capture
the underlying meaning of the source text. Transformer-based models such as T5 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], BART [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and
PEGASUS [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] have further advanced summarization by leveraging self-attention mechanisms and
large-scale pretraining, allowing them to better capture semantic relations and long-range
dependencies. Sentence embeddings were greatly enhanced by BERT (Bidirectional Encoder Representations
from Transformers) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which was pre-trained on sizable corpora using masked language modeling.
BERTSUM [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] refined BERT for extractive summarization, adapting its contextual
representations to select important sentences for summaries, even though BERT is not generative by nature.
Abstractive summarization [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], in contrast, aims to construct new phrases that convey the underlying
content, resembling human-written summaries. Templates and language rules were used in early
abstractive systems, but these techniques were fragile and domain-specific [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Neural abstractive
summarization was made possible by the emergence of sequence-to-sequence models with attention [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which
learned mappings between input and output sequences. Nevertheless, long short-term memory (LSTM)
and recurrent neural network (RNN) models frequently generated repetitive or insufficient summaries
and had trouble handling long-range dependencies. The transformer design [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] later overcame these
drawbacks and served as the basis for contemporary abstractive summarization.
      </p>
      <p>
        In recent years, attention has increasingly turned toward scientific summarization, which presents
unique challenges due to domain-specific vocabulary, technical phrasing, and the need for factual
precision. Several studies explore models designed specifically for scientific writing. Rezapour et al.
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] propose a two-stage system that uses structured document representations enriched with scientific
graph information, improving both content selection and coherence for long scientific texts.
      </p>
      <p>
        Rehman et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] used a GRU-based encoder-decoder with Bahdanau attention to build an English
text summarizer trained on a news-summary dataset, achieving improved performance for generating
concise summaries suitable as headlines. Rehman et al. [15] evaluated pre-trained models such as
Pegasus-CNN-DailyMail, T5-base, and BART-large-CNN for summarization across datasets including
CNN-DailyMail, SAMSum, and BillSum.
      </p>
      <p>Generating research highlights, short bullet points emphasizing key contributions, has emerged as
a specialized task within scientific summarization. Early approaches include supervised extractive
models [16] and regression-based methods [17], with datasets like CSPubSum, AIPubSumm, and
BioPubSumm supporting evaluation. Rehman et al. [18] developed an abstractive pointer-generator
model with GloVe, later enhanced with named entity recognition [19], ELMo embeddings [20, 21], and
SciBERT with coverage mechanisms, and introduced the multi-domain MixSub dataset [22].</p>
      <p>Overall, the literature reflects steady progress across extractive and abstractive techniques, the
adoption of transformer architectures, and growing interest in scientific summarization and
highlight-generation tasks. Despite this progress, persistent challenges remain, including improving factual
consistency, enhancing abstraction, and developing evaluation metrics that more accurately reflect the
needs of scientific communication.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>This section presents the transformer-based models considered for fine-tuning on the highlight
generation task. Their architectural characteristics and parameter scales are outlined below, and Figure 2
provides an overview of the processing framework.</p>
      <p>1. T5 Family of Models</p>
      <p>
        The T5 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] architecture (Text-to-Text Transfer Transformer) frames every natural language task
as a sequence-generation problem. Both the input and the output are treated as text strings,
which allows the model to operate within one unified design across multiple applications such as
classification, summarization, translation, and question answering. The framework is built on
an encoder–decoder transformer, where the encoder produces contextual embeddings and the
decoder generates the target sequence autoregressively.
      </p>
      <p>The T5 model family comes in multiple sizes, from the lightweight T5-small with 60 million
parameters to T5-base (220M), T5-large (770M), and the high-capacity T5-3B and T5-11B. Each
variant increases the number of encoder and decoder layers, hidden dimensions, and attention
heads, providing progressively greater representational power. All versions follow the same
architectural design, allowing a trade-off between performance and computational requirements.</p>
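      <p>As a concrete illustration, the sketch below loads the t5-small checkpoint from Hugging Face and generates a highlight in this text-to-text framing. The "summarize:" task prefix is T5's standard summarization prefix; the abstract text and generation settings are illustrative assumptions, not our exact system.</p>
      <preformat>
# Minimal sketch: highlight generation with T5 in the text-to-text framing.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

abstract = "Dropout is commonly used in deep neural networks to alleviate overfitting."

# Encoder input: the task is expressed as plain text with a task prefix.
inputs = tokenizer("summarize: " + abstract,
                   max_length=256, truncation=True, return_tensors="pt")

# The decoder generates the target sequence autoregressively.
output_ids = model.generate(inputs.input_ids, max_length=30, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      </preformat>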
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <p>In this section, we discuss the dataset provided for the SciHigh shared task, outline the pre-processing
and implementation details, and describe the evaluation metrics used.</p>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>The experiments in this study rely on the MixSub-SciHigh dataset introduced by Rehman et al. [22].
This dataset pairs scientific abstracts with the highlights written by the original authors, making it
well suited for training models that aim to generate concise research contributions. In total, it contains
19,785 research articles collected mainly from ScienceDirect and other academic publishers from the
year 2020. Each record includes an abstract and its corresponding set of highlights, offering a clear
mapping between long-form scientific text and its condensed representation.</p>
        <p>Typically, the dataset is divided into training, validation, and test sets using an 80:10:10 split. For
the FIRE 2025 SciHigh shared task, a prepared version of the dataset was released, consisting of 10,000
samples for training, 1,985 for validation, and 1,840 for testing. This curated collection serves as the core
resource for evaluating systems designed to automatically produce research highlights from abstracts.</p>
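        <p>For concreteness, one record can be pictured as an abstract paired with its author-written highlights, alongside the released split sizes. The field names below are an illustrative assumption, not the exact release format of the shared task files.</p>
        <preformat>
# Hypothetical shape of one MixSub-SciHigh record (field names assumed).
record = {
    "abstract": "Dropout is commonly used in deep neural networks ...",
    "highlights": [
        "A surrogate dropout method is proposed.",
        "Neurons are dropped according to their importance.",
    ],
}

# Split sizes of the curated FIRE 2025 SciHigh release.
splits = {"train": 10000, "validation": 1985, "test": 1840}
print(sum(splits.values()))  # 13825 samples in total
        </preformat>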
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data Pre-processing</title>
        <p>Before model training, several preprocessing steps are applied to ensure that the input text is clean,
consistent, and suitable for transformer-based architectures. The process includes the following
components:
1. Data Cleaning: Removal of extraneous characters, incomplete entries, and formatting
inconsistencies to ensure reliable inputs.
2. Tokenization: The text is segmented into sentences and tokens using NLTK, allowing the
transformer encoder to process the input efficiently.
3. Normalization: Standardization steps such as lowercasing, trimming excess whitespace, and
reducing redundant punctuation help maintain uniformity across samples.
4. Abstract–Highlight Alignment: Each abstract is paired with its corresponding author-provided
highlight to create a clear one-to-one mapping for training and evaluation.</p>
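        <p>A minimal sketch of this pipeline is given below; the specific cleaning rules and function names are illustrative assumptions rather than the exact pre-processing code.</p>
        <preformat>
# Illustrative pre-processing: cleaning, normalization, NLTK tokenization,
# and abstract-highlight alignment for a single training pair.
import re
import nltk

nltk.download("punkt", quiet=True)

def clean_and_normalize(text):
    text = text.lower()                           # lowercasing
    text = re.sub(r"\s+", " ", text).strip()      # trim excess whitespace
    text = re.sub(r"([.,;!?])\1+", r"\1", text)   # reduce redundant punctuation
    return text

def preprocess_pair(abstract, highlight):
    abstract = clean_and_normalize(abstract)
    highlight = clean_and_normalize(highlight)
    sentences = nltk.sent_tokenize(abstract)      # sentence segmentation
    tokens = [nltk.word_tokenize(s) for s in sentences]
    # One-to-one abstract-highlight mapping for training and evaluation.
    return {"abstract": abstract, "tokens": tokens, "highlight": highlight}
        </preformat>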
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Implementation Details</title>
        <p>All model development and experimentation were carried out using the Hugging Face Transformers
library. The primary model used in this study was t5-small, chosen for its efficiency and suitability
for highlight-style summarization.</p>
        <p>The model was trained for three epochs on the MixSub-SciHigh dataset. The maximum input length
was fixed at 256 tokens, while the generated highlights were limited to 30 tokens to maintain concise
summaries. These settings ensured consistent training and avoided unnecessary truncation. The model
was trained with a learning rate of 2e-5.</p>
        <p>All experiments were executed on Google Colab using an NVIDIA T4 GPU, which was sufficient for
full training and evaluation with dynamic padding and standard batching strategies.</p>
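        <p>Under the stated settings, the fine-tuning procedure can be sketched with the Hugging Face Seq2SeqTrainer roughly as follows. The tiny in-memory dataset, its column names, and the batch size are illustrative assumptions; only the model name, epoch count, sequence lengths, and learning rate come from the setup above.</p>
        <preformat>
# Fine-tuning sketch with the stated hyperparameters.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Stand-in dataset; the shared task release provides the real pairs.
train_ds = Dataset.from_dict({
    "abstract": ["Dropout is commonly used in deep neural networks ..."],
    "highlights": ["A surrogate dropout method is proposed."],
})

def tokenize_fn(batch):
    enc = tokenizer(["summarize: " + a for a in batch["abstract"]],
                    max_length=256, truncation=True)      # 256-token inputs
    labels = tokenizer(text_target=batch["highlights"],
                       max_length=30, truncation=True)    # 30-token targets
    enc["labels"] = labels["input_ids"]
    return enc

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-scihigh",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,  # assumed; not reported above
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds.map(tokenize_fn, batched=True),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),  # dynamic padding
)
trainer.train()
        </preformat>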
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Evaluation Metrics</title>
        <p>
          To assess the quality of the automatically generated research highlights, we employ two widely used metrics in summarization research: ROUGE and METEOR. Both metrics compare system-generated summaries with human-written references, but they capture different aspects of summary quality: ROUGE emphasizes lexical overlap, while METEOR incorporates semantic matching and word-order sensitivity.
        </p>
        <sec id="sec-4-4-1">
          <title>4.4.1. ROUGE</title>
          <p>
            ROUGE (Recall-Oriented Understudy for Gisting Evaluation) [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] is a standard evaluation metric for summarization tasks. It measures how much of the reference content is captured by the generated highlight by computing n-gram overlaps. In this work, we report ROUGE-1, ROUGE-2, and ROUGE-L: ROUGE-1 evaluates unigram overlap, ROUGE-2 captures bigram overlap, and ROUGE-L measures the longest common subsequence between the two summaries.
          </p>
          <p>Let overlap denote the number of n-grams shared by the generated and reference highlights, gen the total number of n-grams in the generated highlight, and ref the total number in the reference highlight. Precision and recall are computed as shown in Equation 1, while the F1-score is calculated as per Equation 2.</p>
          <p>Precision = overlap / gen;  Recall = overlap / ref  (1)</p>
          <p>F1-score = (2 × Precision × Recall) / (Precision + Recall)  (2)</p>
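          <p>For illustration, the quantities in Equations 1 and 2 can be computed with a few lines of pure Python; this sketch uses clipped n-gram counts and is not the official ROUGE package.</p>
          <preformat>
# Illustrative ROUGE-n precision, recall, and F1 (Equations 1 and 2).
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(generated, reference, n=1):
    gen = ngram_counts(generated.split(), n)
    ref = ngram_counts(reference.split(), n)
    overlap = sum(min(gen[g], ref[g]) for g in gen)   # overlapping n-grams
    precision = overlap / max(sum(gen.values()), 1)   # overlap / gen
    recall = overlap / max(sum(ref.values()), 1)      # overlap / ref
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

print(rouge_n("dropout reduces overfitting",
              "dropout is used to reduce overfitting"))
          </preformat>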
          <p>ROUGE focuses on lexical similarity, providing insight into how well the generated highlight captures the key terms and phrases of the reference summary.</p>
        </sec>
        <sec id="sec-4-4-2">
          <title>4.4.2. METEOR</title>
          <p>
            METEOR (Metric for Evaluation of Translation with Explicit Ordering) [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] offers a complementary perspective by incorporating semantic matching and ordering constraints. Unlike ROUGE, which relies strictly on n-gram overlap, METEOR aligns unigrams using exact matches, stemming, and synonyms, enabling a more meaning-oriented evaluation. It also applies a fragmentation penalty to account for disordered or scattered matches.
          </p>
          <p>Let P represent unigram precision, R represent unigram recall, and let matched unigrams be grouped into ordered chunks. The mean score, fragmentation penalty, and final METEOR score are computed using Equations 3, 4, and 5.</p>
          <p>mean = (10 × P × R) / (R + 9 × P)  (3)</p>
          <p>Penalty = 0.5 × (#chunks / #matched_unigrams)³  (4)</p>
          <p>METEOR = mean × (1 − Penalty)  (5)</p>
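          <p>A simplified sketch of Equations 3, 4, and 5 follows. It aligns unigrams by exact match only and uses a crude chunk count, whereas full METEOR also matches stems and synonyms.</p>
          <preformat>
# Illustrative METEOR score (Equations 3, 4, and 5); exact matches only.
def meteor(generated, reference):
    gen, ref = generated.split(), reference.split()
    matched = [t for t in gen if t in ref]      # exact unigram matches
    if not matched:
        return 0.0
    P = len(matched) / len(gen)                 # unigram precision
    R = len(matched) / len(ref)                 # unigram recall
    mean = (10 * P * R) / (R + 9 * P)           # Equation 3
    # Chunks: runs of matches that stay consecutive in the reference
    # (simplified: duplicate tokens map to their first occurrence).
    chunks, prev = 0, -2
    for t in matched:
        idx = ref.index(t)
        if idx != prev + 1:
            chunks += 1
        prev = idx
    penalty = 0.5 * (chunks / len(matched)) ** 3  # Equation 4
    return mean * (1 - penalty)                   # Equation 5

print(round(meteor("the cat sat on the mat",
                   "the cat was on the mat"), 3))
          </preformat>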
          <p>METEOR provides a more semantically sensitive evaluation by rewarding synonym matches and penalizing disordered alignments, making it well suited for assessing highlight-generation quality.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>As shown in Table 2, our team Ayanika achieved a ROUGE-L F1 score of 17.91%, placing us at the
10th position among all participating teams.</p>
      <sec id="sec-5-1">
        <title>5.1. Case Study</title>
        <p>To better understand the quality of the generated highlights, we present a case study in Figure 3. This
shows an example in which the author-written highlight is compared with the highlight generated by
the T5-small model. The case study reveals that the T5-small model correctly captures several core
elements from the abstract. Specifically, it identifies key ideas such as “dropout is commonly used to
reduce overfitting”, “fixed drop probability”, and “performance degradation when dropout is applied
extensively”.</p>
        <p>However, the generated highlight remains largely extractive and lacks the concise, contribution-focused expressions found in the author-written highlights. Crucial ideas such as the “surrogate dropout” method, the “per-neuron drop rate”, and the “superior regularization performance across datasets” are not captured by the model.</p>
        <p>[Table 2. Leaderboard of the SciHigh track, listing participating teams and their submitted runs: Text_highlights_gen (run1), AiNauts (run1), SVNIT_CSE (run1), NLPFusion (run2), The NLP Explorers (run2), NIT_PATNA_2025 (run1), MUCS (run1), JU_CSE_PR_KS (run1), SCaLAR (run1), Ayanika (run1).]</p>
        <p>Overall, the comparison indicates that while the T5-small model can identify important surface-level
information, it struggles to produce abstracted, contribution-oriented highlights. This highlights the
need for improved fine-tuning strategies that strengthen abstraction, compression, and emphasis on
novel contributions.</p>
        <p>Abstract: “Dropout is commonly used in deep neural networks to alleviate the problem of overfitting. Conventionally the neurons in a layer indiscriminately share a fixed drop probability which results in difficulty in determining the appropriate value for different tasks. Moreover this static strategy will also incur serious degradation on performance when the conventional dropout is extensively applied to both shallow and deep layers. A question is whether selectively dropping the neurons would realize a better regularization effect. This paper proposes a simple and effective surrogate dropout method whereby neurons are dropped according to their importance. The proposed method has two main stages. The first stage trains a surrogate module that can be jointly optimized along with the neural network to evaluate the importance of each neuron. In the second stage the output of the surrogate module is regarded as a guidance signal for dropping certain neurons approximating the optimal per neuron drop rate when the network converges. Various convolutional neural network architectures and multiple datasets including CIFAR 10 CIFAR 100 SVHN Tiny ImageNet and two medical image datasets are used to evaluate the surrogate dropout method. The experimental results demonstrate that the proposed method achieves a better regularization effect than the baseline methods.”</p>
        <p>Author-written highlight: A simple and effective regularization method called surrogate dropout is proposed which regards the surrogate module as a proxy for approximating the optimal drop rate of each neuron. Compared with conventional dropout the surrogate dropout method has fewer restrictions. Both the shallow and deep layers in CNNs can benefit from the usage of surrogate dropout. The superior regularization effect of surrogate dropout has been empirically verified using multiple datasets and networks with various depths.</p>
        <p>Fine-tuned T5-small model generated highlight: Dropout is commonly used in deep neural networks to alleviate the problem of overfitting. Conventionally the neurons in a layer indiscriminately share a fixed drop probability which results in difficulty in determining the appropriate value for different tasks. Moreover this static strategy will also incur serious degradation on performance when the conventional dropout is extensively applied.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions and Future Work</title>
      <p>Our team Ayanika used a transformer-based, fine-tuned T5 model for generating research highlights
and secured the 10th position in the SciHigh track. The model was able to convert lengthy scientific
abstracts into concise and well-structured highlights, demonstrating the effectiveness of T5 as a strong
baseline for this task.</p>
      <p>However, the approach has certain limitations, including restricted dataset coverage and the
computational overhead of fine-tuning T5 for domain-specific summarization. Future improvements may
include expanding the dataset to cover more research areas, enhancing factual consistency, and
integrating human feedback or richer semantic evaluation metrics to further refine the quality of generated
highlights.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) used Generative AI tools solely for grammar and spelling checks. All experimental work and analyses were carried out independently by the author(s).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <surname>ROUGE:</surname>
          </string-name>
          <article-title>A package for automatic evaluation of summaries, in: Text Summarization Branches Out, Association for Computational Linguistics</article-title>
          , Barcelona, Spain,
          <year>2004</year>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>81</lpage>
          . URL: https://aclanthology.org/W04-1013/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lavie</surname>
          </string-name>
          ,
          <string-name>
            <surname>METEOR:</surname>
          </string-name>
          <article-title>An automatic metric for MT evaluation with improved correlation with human judgments</article-title>
          , in: J.
          <string-name>
            <surname>Goldstein</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lavie</surname>
            ,
            <given-names>C.-Y.</given-names>
          </string-name>
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          Voss (Eds.),
          <source>Proceedings of the ACL Workshop</source>
          on Intrinsic and
          <article-title>Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Association for Computational Linguistics</article-title>
          , Ann Arbor, Michigan,
          <year>2005</year>
          , pp.
          <fpage>65</fpage>
          -
          <lpage>72</lpage>
          . URL: https://aclanthology.org/W05-0909/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          , P. Tarau,
          <article-title>TextRank: Bringing order into text</article-title>
          , in: D.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          Wu (Eds.),
          <source>Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Barcelona, Spain,
          <year>2004</year>
          , pp.
          <fpage>404</fpage>
          -
          <lpage>411</lpage>
          . URL: https://aclanthology.org/ W04-3252/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rafel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Exploring the limits of transfer learning with a unified text-to-text transformer</article-title>
          ,
          <source>J. Mach. Learn. Res</source>
          .
          <volume>21</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          , L. Zettlemoyer, BART:
          <article-title>Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</article-title>
          , in: D.
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Chai</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Schluter</surname>
          </string-name>
          , J. Tetreault (Eds.),
          <article-title>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</article-title>
          ,
          <string-name>
            <surname>ACL</surname>
          </string-name>
          , Online,
          <year>2020</year>
          , pp.
          <fpage>7871</fpage>
          -
          <lpage>7880</lpage>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2020</year>
          .acl-main.
          <volume>703</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saleh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Pegasus: pre-training with extracted gap-sentences for abstractive summarization</article-title>
          ,
          <source>in: Proceedings of the 37th International Conference on Machine Learning, ICML'20</source>
          , JMLR.org,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in: J.
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Doran</surname>
          </string-name>
          , T. Solorio (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://aclanthology.org/N19-1423/. doi:
          <volume>10</volume>
          .18653/v1/
          <fpage>N19</fpage>
          -1423.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lapata</surname>
          </string-name>
          ,
          <article-title>Text summarization with pretrained encoders</article-title>
          , in: K. Inui,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <surname>X.</surname>
          </string-name>
          Wan (Eds.),
          <source>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLPIJCNLP)</source>
          ,
          <article-title>Association for Computational Linguistics</article-title>
          , Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>3730</fpage>
          -
          <lpage>3740</lpage>
          . URL: https://aclanthology.org/D19-1387/. doi:
          <volume>10</volume>
          .18653/v1/
          <fpage>D19</fpage>
          -1387.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , C. Zhang,
          <article-title>Enhancing abstractive summarization of scientific papers using structure information</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>261</volume>
          (
          <year>2025</year>
          )
          <article-title>125529</article-title>
          . URL: https://www. sciencedirect.com/science/article/pii/S0957417424023960. doi:https://doi.org/10.1016/j. eswa.
          <year>2024</year>
          .
          <volume>125529</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Jing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>McKeown</surname>
          </string-name>
          ,
          <article-title>Cut and paste based text summarization, in: 1st Meeting of the North American Chapter of the Association for Computational Linguistics</article-title>
          ,
          <year>2000</year>
          . URL: https: //aclanthology.org/A00-2024/.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <article-title>Sequence to sequence learning with neural networks</article-title>
          ,
          <source>ArXiv abs/1409</source>
          .3215 (
          <year>2014</year>
          ). URL: https://api.semanticscholar.org/CorpusID:7961699.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Polosukhin</surname>
          </string-name>
          , Attention is all you need,
          <year>2017</year>
          . URL: https://arxiv.org/pdf/1706.03762.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rezapour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ge</surname>
          </string-name>
          , K. Han,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jeong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Diesner</surname>
          </string-name>
          ,
          <article-title>Two-stage graph-augmented summarization of scientific documents</article-title>
          , in: L.
          <string-name>
            <surname>Peled-Cohen</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Calderon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Lissak</surname>
          </string-name>
          , R. Reichart (Eds.),
          <source>Proceedings of the 1st Workshop on NLP for Science (NLP4Science)</source>
          ,
          <article-title>Association for Computational Linguistics</article-title>
          , Miami, FL, USA,
          <year>2024</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>46</lpage>
          . URL: https://aclanthology.org/
          <year>2024</year>
          .nlp4science-
          <fpage>1</fpage>
          .5/. doi:
          <volume>10</volume>
          . 18653/v1/
          <year>2024</year>
          .nlp4science-
          <fpage>1</fpage>
          .5.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <article-title>Abstractive text summarization using attentive</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>