<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Transformer-based models for generating research highlights from scientific articles</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gowni Bhavishya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Siba Sankar Sahu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sardar Vallabhbhai National Institute of Technology</institution>
          ,
          <addr-line>Surat</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>The rapid growth of scientific publications makes it difficult for researchers to identify the key contributions of each work. Abstracts usually present a broad summary, while highlights convey the novelty and key findings more precisely. As part of the FIRE 2025 SciHigh shared task, we explore automatic highlight generation, where the goal is to generate concise highlights from scientific articles. In this study, our team SVNIT_CSE explored different transformer-based models for highlight generation. Among the models, BART provides the best performance and generates highlights similar to the ground truth. Transformer-based research highlight generation thus shows potential to facilitate faster and more targeted dissemination of scientific knowledge.</p>
      </abstract>
      <kwd-group>
        <kwd>Research Highlights</kwd>
        <kwd>Highlight generation</kwd>
        <kwd>Scientific publications</kwd>
        <kwd>Transformer-based models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In recent decades, the number of scientific publications has grown at an unprecedented rate, with millions of papers published annually across a variety of disciplines [<xref ref-type="bibr" rid="ref1">1</xref>]. This rapid growth makes it challenging for researchers, reviewers, and digital libraries to quickly identify the essential contributions of each work. A scientific paper includes an abstract that provides a comprehensive summary of the research. However, abstracts are often long, highly descriptive, and stylistically diverse, making them less efficient for rapid screening in extensive literature searches. Research highlights are therefore essential: they comprise brief bullet-point statements that summarize the main contributions of a paper in a clear and structured manner. Research highlights offer immediate information, allowing readers to assess the relevance of a paper more efficiently, and they help digital libraries improve indexing and retrieval. To date, highlights have mostly been written manually, adding extra work for authors and editors [<xref ref-type="bibr" rid="ref2">2</xref>]. Hence, automated highlight generation systems that can produce concise, accurate, and stylistically consistent outputs are essential for the research domain.
      </p>
      <p>
        Different studies have been conducted on scientific summarization. Early work focused primarily on extractive approaches that select the most significant sentences from a paper [<xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>]. The extractive approach is simple but tends to generate poor research highlights. Deep learning models provide better research highlights by using abstractive approaches [<xref ref-type="bibr" rid="ref5">5</xref>], since they can generate new sentences rather than simply re-use existing ones.
      </p>
      <p>
        In recent years, transformer models have provided the best performance on different extractive and abstractive summarization tasks [<xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>]. However, transformer models remain less explored for research highlight generation. In this study, we explore various transformer-based models to generate research highlights. The explored models achieve noticeable performance and outperform existing models in the generation of research highlights. The highlights generated by the transformer models are more concise, accurate, and stylistically comparable to those written by the authors. Moreover, the generated highlights balance conciseness with factual accuracy and rarely produce outputs that deviate from the brief, bullet-style format expected of research highlights.
      </p>
      <sec id="sec-1-1">
        <title>1.1. Task Description</title>
        <p>The SciHigh shared task (https://sites.google.com/jadavpuruniversity.in/scihigh2025/home) was part of the FIRE (Forum for Information Retrieval Evaluation; https://fire.irsi.org.in/fire/2025/home) 2025 evaluation campaign. The goal of the SciHigh shared task is to automatically generate research highlights from scientific articles. In traditional summarization, a model generates a paragraph-long summary from which it is very difficult to extract the relevant information, since the model often captures only surface-level keywords. A research highlight instead offers brief bullet points that convey the noteworthy and innovative contributions of a scientific paper. The task therefore focuses on creating extremely compressed and information-rich sentences that resemble the highlights of a research paper. This makes the task particularly challenging: models must handle domain-specific vocabulary, maintain factual accuracy, and strike a balance between conciseness and informativeness. Highlight generation lies at the intersection of abstractive summarization and scientific information retrieval, with practical applications in enhancing research discoverability.</p>
        <p>The remainder of the paper is organized as follows. Section 2 presents existing work on text summarization and research highlight generation. Section 3 describes the statistics of the dataset. The model framework is presented in Section 4, followed by the experimental results in Section 5. Finally, we conclude with future work in Section 6.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Text summarization is an important downstream task in natural language processing. Numerous studies have been conducted in high- and low-resource languages [<xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>]. Abstractive summarization research has made significant progress with the emergence of sequence-to-sequence models. Nallapati et al. [<xref ref-type="bibr" rid="ref10">10</xref>] applied recurrent attentional encoder-decoder networks to abstractive summarization. Rush et al. [11] and Chopra et al. [12] implemented sequence-to-sequence models with attention. Different deep neural network models generate novel sentences but often produce repetitive content or leave out key information. To overcome these limitations, See et al. [<xref ref-type="bibr" rid="ref5">5</xref>] proposed a pointer-generator network with coverage and achieved a balance between abstraction and the ability to copy technical terms. Later, Anh and Trang [13] improved the model performance by adding Word2Vec and FastText embeddings. These advances marked the emergence of neural abstractive summarization as a viable alternative to extractive methods.
      </p>
      <p>
        Researchers have also investigated summarization techniques in the scientific domain [14]. Adapting summarization to the scientific domain poses new challenges, including technical vocabulary, long input lengths, and structured discourse. Nallapati et al. [15] introduced a new corpus comprising the introduction and abstract sections and explored multiple-time-scale GRUs on arXiv articles. Souza et al. [16] developed a multi-view extractive method for scientific texts. Collins et al. [<xref ref-type="bibr" rid="ref3">3</xref>] presented the CSPubSum dataset [17], which consists of more than 10,000 computer science articles with author-written highlights. Moreover, Cagliero and Quatra [<xref ref-type="bibr" rid="ref4">4</xref>] framed highlight extraction as top-k sentence selection. Rehman et al. [18] applied pointer-generator networks with GloVe embeddings to CSPubSum [17] for scientific highlight generation and demonstrated that the proposed methods produced higher-quality highlights than extractive systems. Subsequently, Rehman et al. [19] incorporated named entity recognition into pointer-generator models to improve factual accuracy. They also explored ELMo [20] contextual embeddings to improve semantic representation.
      </p>
      <p>The transformer architecture [21] revolutionized text summarization by enabling large-scale pre-training. Models such as BERTSUM [22], BART [23], T5 [24], and PEGASUS [25] produce better abstractive summaries. Beltagy et al. [26] presented SciBERT, a model pre-trained on academic texts, to address domain-specific language, which enhanced the performance of summarization and associated downstream tasks. Most recently, Rehman et al. [19] combined pointer-generator networks with SciBERT embeddings and showed that abstract-only inputs paired with domain-specific embeddings are sufficient for highlight generation. These transformer-based approaches made summaries more fluent and logical, improved factual consistency, and helped summarization work better across different domains. As a result, they set a new standard for summarization research and showed that pre-training and fine-tuning methods work well for natural language generation.</p>
      <p>From the above review, we find that a variety of methods have been investigated, from statistical methods to neural approaches, and from generic summarization to domain-specific adaptations. However, existing models still do not produce accurate, concise, and stylistically appropriate summaries. In this study, we explore different transformer-based models, namely BART, T5, LongT5, LED, and PEGASUS, which have been widely adopted for general and long-document summarization. These models have not been thoroughly explored for generating scientific research highlights, which require highly compressed, contribution-focused output. We evaluate the pre-trained transformer models on the research highlight generation task to establish a performance baseline and examine their suitability for automated scientific communication.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>
        In this task, we used the MixSub-SciHigh dataset, a subset of the MixSub corpus [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which was
constructed from research articles published in ScienceDirect. The corpus comprises 19,785 articles in
various scientific domains, where each instance is represented as a pair consisting of the abstract of the
article and the research highlights written by the authors. The abstracts provide concise summaries of
the papers, while the highlights capture their key contributions in short bullet-style statements. The
statistics of the MixSub-SciHigh dataset are shown in Table 1.
      </p>
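      <p>For concreteness, the following minimal Python sketch shows one way such abstract-highlight pairs could be loaded and profiled; the file name train.csv and the column names abstract and highlights are illustrative assumptions, not part of the official data release.</p>
      <preformat>
# Minimal sketch: load abstract-highlight pairs and report basic length statistics.
# The file name "train.csv" and the column names "abstract" and "highlights" are
# illustrative assumptions, not part of the official MixSub-SciHigh release.
import csv
import statistics

def load_pairs(path):
    with open(path, newline="", encoding="utf-8") as f:
        return [(row["abstract"], row["highlights"]) for row in csv.DictReader(f)]

pairs = load_pairs("train.csv")
abstract_lengths = [len(a.split()) for a, _ in pairs]
highlight_lengths = [len(h.split()) for _, h in pairs]
print("instances:", len(pairs))
print("mean abstract length (words):", round(statistics.mean(abstract_lengths), 1))
print("mean highlights length (words):", round(statistics.mean(highlight_lengths), 1))
      </preformat>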
    </sec>
    <sec id="sec-4">
      <title>4. Model Framework</title>
      <p>We investigated various transformer-based models for scientific highlight generation. All models are evaluated on the MixSub-SciHigh dataset, in which abstracts serve as the input and author-written highlights serve as the output. Experiments are conducted on Kaggle using an NVIDIA Tesla P100 GPU. For the encoder-decoder models, we experimented with different sequence-to-sequence models and parameter settings, and the best performance obtained at a given parameter setting is shown in Table 2. We briefly describe the transformer models below.</p>
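      <p>As a minimal sketch of the inference setup, the snippet below loads one of the evaluated checkpoints (facebook/bart-large-cnn) with the Hugging Face transformers library and generates a candidate highlight from a single abstract; the beam size and length limits shown are illustrative rather than the exact values used in our experiments.</p>
      <preformat>
# Minimal inference sketch: generate a candidate highlight from one abstract
# with a pre-trained checkpoint; beam size and length limits are illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

abstract = "An abstract from the MixSub-SciHigh dataset ..."
inputs = tokenizer(abstract, max_length=1024, truncation=True, return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      </preformat>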
      <sec id="sec-4-1">
        <title>4.1. BART</title>
        <p>Lewis et al. [23] introduced BART (Bidirectional and Auto-Regressive Transformers), a denoising autoencoder sequence-to-sequence model. The architecture combines a left-to-right autoregressive decoder, as in GPT, with a bidirectional encoder, as in BERT. Pre-training involves token masking, deletion, and sentence permutation, followed by reconstruction. This design makes BART particularly effective for summarization. We experimented with the facebook/bart-large-cnn variant (https://huggingface.co/facebook/bart-large-cnn) on the MixSub-SciHigh dataset. BART provides the best performance and outperforms all evaluated models.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. T5</title>
        <p>Raffel et al. [24] presented the T5 (Text-to-Text Transfer Transformer) model. The model uses an encoder-decoder architecture and reframes all tasks as text-to-text problems. T5 is trained with a 'span corruption' denoising objective, which makes it flexible for applications such as abstractive summarization and highlight generation. T5 was trained on a large web-text dataset called the Colossal Clean Crawled Corpus (C4) [24]. We experimented with the t5-base variant (https://huggingface.co/t5-base) on the MixSub-SciHigh dataset. The model produces fluent and coherent output but tends to generate longer and less precise summaries.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. LongT5</title>
        <p>Guo et al. [27] proposed LongT5, a variation of the T5 architecture that can handle longer input sequences. Whereas the conventional T5 model uses full self-attention, LongT5 incorporates a local-global attention mechanism that supports larger input sequence lengths. The model combines global tokens, which collect broader document-level context, with local sliding-window attention for short-range dependencies, thus striking a compromise between efficiency and contextual awareness. We experimented with the google/long-t5-tglobal-base variant (https://huggingface.co/google/long-t5-tglobal-base) on the MixSub-SciHigh dataset. LongT5 accepts longer input sequences, but this advantage was limited because the abstracts used as input are relatively short.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. LED</title>
        <p>Beltagy et al. [28] introduced the Longformer Encoder-Decoder (LED) model, which builds on the Longformer architecture and is designed for sequence-to-sequence generation. LED combines local attention with a small number of global attention tokens, whereas traditional transformer models rely fully on self-attention. The model can handle inputs of up to 16,384 tokens, which makes it suitable for long-document summarization. We experimented with the allenai/led-base-16384 variant (https://huggingface.co/allenai/led-base-16384) on the MixSub-SciHigh dataset. The LED architecture is designed to take advantage of long input sequences, but with the limited input length in our experiments it shows poor effectiveness.</p>
      </sec>
      <sec id="sec-4-1">
        <title>4.5. PEGASUS</title>
        <p>Zhang et al. [25] introduced the PEGASUS model, which was pre-trained with a dedicated 'gap sentence generation' objective and specifically designed for abstractive summarization. By masking complete sentences during pre-training, PEGASUS forces the model to produce those sentences from the surrounding context. PEGASUS-PubMed was further pre-trained on biomedical corpora. We experimented with the google/pegasus-pubmed variant (https://huggingface.co/google/pegasus-pubmed) on the MixSub-SciHigh dataset. It performed reasonably well but struggled to generalize outside its biomedical domain. We experimented with different hyperparameters for highlight generation; the best performance achieved by each transformer model at a given hyperparameter setting is shown in Table 2. The models were trained for up to five epochs at a learning rate of 0.00002.</p>
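        <p>The following is a minimal sketch of the kind of fine-tuning configuration described above, assuming the Hugging Face Seq2SeqTrainer API; only the five epochs and the 0.00002 learning rate come from our setup, while the batch size, maximum lengths, and the toy data preparation are illustrative assumptions.</p>
        <preformat>
# Minimal fine-tuning sketch using the Hugging Face Seq2SeqTrainer.
# Only the five epochs and the 2e-5 learning rate come from our setup;
# the batch size, maximum lengths, and the toy dataset below are illustrative.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Placeholder abstract/highlight pairs standing in for the MixSub-SciHigh training split.
raw = Dataset.from_dict({
    "abstract": ["An abstract from the training split ..."],
    "highlights": ["A bullet-style highlight written by the authors ..."],
})

def preprocess(batch):
    model_inputs = tokenizer(batch["abstract"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["highlights"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_dataset = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="bart-mixsub-scihigh",
    num_train_epochs=5,              # as reported above
    learning_rate=2e-5,              # 0.00002, as reported above
    per_device_train_batch_size=4,   # illustrative
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
        </preformat>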
        <sec id="sec-4-1-1">
          <title>3https://huggingface.co/facebook/bart-large-cnn</title>
          <p>4https://huggingface.co/t5-base
5https://huggingface.co/google/long-t5-tglobal-base
6https://huggingface.co/allenai/led-base-16384
7https://huggingface.co/google/pegasus-pubmed</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussion</title>
      <p>In this study, we explore various transformer-based models for highlight generation from scientific articles. To evaluate the effectiveness of the models, we used four widely used evaluation metrics: ROUGE-1, ROUGE-2, ROUGE-L, and METEOR. The evaluation was carried out on the masked test set (1,840 instances) of the MixSub-SciHigh corpus. Table 3 presents the performance of the transformer models on highlight generation. The effectiveness of the different transformer models is presented graphically in Figures 1a, 1b, 1c and 1d.</p>
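      <p>These scores can be reproduced with standard implementations of the metrics; the sketch below uses the Hugging Face evaluate package, with placeholder lists standing in for the generated and reference highlights.</p>
      <preformat>
# Minimal evaluation sketch with the Hugging Face `evaluate` package;
# the two lists are placeholders for generated and author-written highlights.
import evaluate

predictions = ["A generated highlight ..."]
references = ["The corresponding author-written highlight ..."]

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")

rouge_scores = rouge.compute(predictions=predictions, references=references)
meteor_scores = meteor.compute(predictions=predictions, references=references)

print("ROUGE-1:", round(rouge_scores["rouge1"], 4))
print("ROUGE-2:", round(rouge_scores["rouge2"], 4))
print("ROUGE-L:", round(rouge_scores["rougeL"], 4))
print("METEOR:", round(meteor_scores["meteor"], 4))
      </preformat>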
      <p>From the evaluation results, we found that BART provides the best performance and outperforms the other models on both ROUGE and METEOR scores. Its denoising-autoencoder pre-training, which combines a bidirectional encoder and an autoregressive decoder, contributes to its strong balance of fluency and content coverage, and this makes BART well suited for highlight generation. The effectiveness of each model varies due to differences in architecture and pre-training objectives. BART achieves comparatively stronger results because its denoising objective is well suited to compressing important information into sharp and short highlights. T5 performs slightly lower, as its general text-to-text formulation is not specifically optimized for abstractive summarization. The T5 model produced coherent and stylistically fluent outputs; however, it generates less precise highlights.</p>
      <p>The LongT5 architecture can handle long input sequences thanks to its extended attention mechanism. However, this capability is not fully utilized on relatively short scientific abstracts, resulting in no notable improvement. Similarly, LED uses sparse attention and can handle thousands of tokens, but in this task the short context length does not exploit its long-range advantages. Both LongT5 and LED were designed to handle much longer input texts, but their advantages could not be fully realized here because of the shorter abstract lengths. This suggests that long-context architectures may not offer substantial benefits when applied to relatively brief scientific abstracts.</p>
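      <p>One quick way to verify this observation is to compare tokenized abstract lengths with each model's nominal input budget, as in the sketch below; the abstracts list is a placeholder, and the LongT5 budget shown is a typical configuration rather than a hard architectural bound.</p>
      <preformat>
# Minimal sketch: compare tokenized abstract lengths with each model's input budget.
# The abstracts list is a placeholder; the LongT5 budget is a typical configuration
# rather than a hard architectural bound.
from transformers import AutoTokenizer

abstracts = ["An abstract from the MixSub-SciHigh test split ..."]
input_budgets = {
    "facebook/bart-large-cnn": 1024,
    "google/long-t5-tglobal-base": 4096,
    "allenai/led-base-16384": 16384,
}
for name, budget in input_budgets.items():
    tok = AutoTokenizer.from_pretrained(name)
    longest = max(len(tok(a)["input_ids"]) for a in abstracts)
    share = 100.0 * longest / budget
    print(f"{name}: longest abstract uses {share:.1f}% of a {budget}-token budget")
      </preformat>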
      <p>PEGASUS offers competitive performance compared to the other explored models. However, domain mismatch leads to inconsistent and overly descriptive outputs rather than concise highlights. We found that the effectiveness of each model is influenced by its pre-training strategy, context handling, and compression ability. It is interesting to note that the PEGASUS variant used here was pre-trained on biomedical corpora and shows a lack of consistency when applied outside its target medical domain. Its outputs were often overly descriptive and did not capture concise highlights.</p>
      <p>Figure 1: (a) ROUGE-1, (b) ROUGE-2, (c) ROUGE-L, and (d) METEOR scores of different transformer-based models.</p>
      <p>An example of the author-written research highlights and the highlights generated by the different transformer-based models is shown in Table 4. The author-written research highlights offer a concise and coherent summary of the abstract. Among the transformer models, BART and T5 produce relatively fluent outputs that preserve key semantic elements, with BART aligning more closely with the reference highlights. LED and LongT5 attempt to capture technical details but sometimes generate redundancies or incomplete phrasing. PEGASUS often focuses on narrow details rather than the intended highlight, which leads to an incomplete representation. Overall, the different transformer architectures balance fluency, content preservation, and conciseness in different ways when generating highlights.</p>
      <p>According to the shared task submission policy, we could submit up to two models; the remaining models we evaluated ourselves. We therefore submitted the two best-performing models, BART and T5. Among the models evaluated, BART-large-CNN offers the best performance. The results demonstrate that combining the generative power of large pre-trained transformers with fine-tuning on scientific-domain data yields highlights that are concise, factually accurate, and stylistically consistent with those written by authors. Table 5 shows the leaderboard results of the FIRE 2025 SciHigh shared task. Our team, SVNIT_CSE, achieved a ROUGE-L score of 0.2302, securing the third position on the leaderboard. The explored models improve effectiveness and offer competitive performance compared with the other systems evaluated in the shared task.</p>
      <p>Table 4 shows, for a sample abstract on irritability as a transdiagnostic psychiatric feature, the author-written research highlights together with the highlights generated by BART, T5, LongT5, LED, and PEGASUS.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The generation of scientific highlights is an important downstream task in natural language processing. In this study, we implemented several transformer-based models, including BART, T5, LongT5, LED, and PEGASUS, for highlight generation. Among them, BART outperformed the other models
on the ROUGE and METEOR metrics. The highlights generated by the BART model are quite similar to the author-written highlights, whereas the PEGASUS model shows poor effectiveness. Although these results are encouraging, several challenges remain: the generated highlights are sometimes redundant or inaccurate, and current evaluation metrics are not always aligned with human judgments of quality and usefulness. In future work, we plan to explore the integration of domain-adapted embeddings or retrieval-augmented strategies for highlight generation. Additionally, combining automatic metrics with human evaluation will provide a more comprehensive assessment of highlight quality.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We sincerely thank the organizers of FIRE 2025 for providing the opportunity to host the SciHigh track
as part of the conference. We sincerely acknowledge the creators of the MixSub corpus for compiling
and curating this high-quality dataset from ScienceDirect research articles. Their effort in assembling
and organizing large-scale scientific content has been instrumental in enabling this shared task and
advancing research in scientific text summarization.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT for grammar and spelling checking. After using this tool, the author(s) reviewed and edited the content as needed and take full responsibility for the publication's content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bornmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Haunschild</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mutz</surname>
          </string-name>
          ,
          <article-title>Growth rates of modern science: a latent piecewise growth curve approach to model publication numbers from established and new literature databases</article-title>
          ,
          <source>Humanities and Social Sciences Communications</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Bhowmick</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. P. Das</surname>
          </string-name>
          ,
          <article-title>Generation of highlights from research papers using pointer-generator networks and scibert embeddings</article-title>
          ,
          <source>IEEE Access 11</source>
          (
          <year>2023</year>
          )
          <fpage>91358</fpage>
          -
          <lpage>91374</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Collins</surname>
          </string-name>
          , I. Augenstein,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <article-title>A supervised approach to extractive summarisation of scientific papers</article-title>
          ,
          <source>in: Conference on Computational Natural Language Learning</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>195</fpage>
          -
          <lpage>205</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Quatra</surname>
          </string-name>
          ,
          <article-title>Extracting highlights of scientific articles: A supervised summarization approach</article-title>
          ,
          <source>Expert Syst. Appl</source>
          .
          <volume>160</volume>
          (
          <year>2020</year>
          )
          <fpage>113659</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>See</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>Get to the point: Summarization with pointer-generator networks, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics</article-title>
          ,
          <string-name>
            <surname>ACL</surname>
          </string-name>
          <year>2017</year>
          ,
          <year>2017</year>
          , pp.
          <fpage>1073</fpage>
          -
          <lpage>1083</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Luhn</surname>
          </string-name>
          ,
          <article-title>The automatic creation of literature abstracts</article-title>
          ,
          <source>IBM Journal of Research and Development</source>
          <volume>2</volume>
          (
          <year>1958</year>
          )
          <fpage>159</fpage>
          -
          <lpage>165</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Rehman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          ,
          <article-title>An analysis of abstractive text summarization using pre-trained models</article-title>
          ,
          <source>CoRR</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Mubasshir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shahriyar</surname>
          </string-name>
          , Xl-sum:
          <article-title>Large-scale multilingual abstractive summarization for 44 languages, in: Findings of the Association for Computational Linguistics</article-title>
          : ACL,
          <year>2021</year>
          , pp.
          <fpage>4693</fpage>
          -
          <lpage>4703</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Palen-Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lignos</surname>
          </string-name>
          ,
          <article-title>Lr-sum: Summarization for less-resourced languages, in: Findings of the Association for Computational Linguistics: ACL, Association for Computational Linguistics</article-title>
          ,
          <year>2023</year>
          , pp.
          <fpage>6829</fpage>
          -
          <lpage>6844</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Nallapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <surname>C. N. dos Santos</surname>
          </string-name>
          , Ç. Gülçehre,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <article-title>Abstractive text summarization using sequence-to-sequence rnns and beyond</article-title>
          ,
          <source>in: Proc. 20th SIGNLL Conference on Computational Natural Language Learning</source>
          , CoNLL, Association for Computational Linguistics,
          <year>2016</year>
          , pp.
          <fpage>280</fpage>
          -
          <lpage>290</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] A. M. Rush, S. Chopra, J. Weston, A neural attention model for abstractive sentence summarization, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, Association for Computational Linguistics, 2015, pp. 379-389.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] S. Chopra, M. Auli, A. M. Rush, Abstractive sentence summarization with attentive recurrent neural networks, in: North American Chapter of the Association for Computational Linguistics, 2016.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] D. T. Anh, N. T. T. Trang, Abstractive text summarization using pointer-generator networks with pre-trained word embedding, in: Proceedings of the 10th International Symposium on Information and Communication Technology, 2019, pp. 473-478.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] T. Rehman, S. Chattopadhyay, D. K. Sanyal, Abstractive summarization of scientific documents: Models and evaluation techniques, in: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE, ACM, 2023, pp. 121-124.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] R. Nallapati, F. Zhai, B. Zhou, SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents, in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017, pp. 3075-3081.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] C. M. de Souza, M. R. G. Meireles, R. Vimieiro, A multi-view extractive text summarization approach for long scientific articles, in: International Joint Conference on Neural Networks, IJCNN, IEEE, 2022, pp. 1-8.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] E. Collins, I. Augenstein, S. Riedel, A supervised approach to extractive summarisation of scientific papers, in: Conference on Computational Natural Language Learning, 2017, pp. 195-205. Dataset available at the GitHub repository referenced in the paper.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] T. Rehman, D. K. Sanyal, et al., Automatic generation of research highlights from scientific abstracts, in: Proceedings of the 2nd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE), JCDL, CEUR Workshop Proceedings, 2021, pp. 69-70.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] T. Rehman, D. K. Sanyal, S. Chattopadhyay, et al., Named entity recognition based automatic generation of research highlights, 2022, pp. 163-169.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] T. Rehman, D. K. Sanyal, S. Chattopadhyay, Research highlight generation with ELMo contextual embeddings, Scalable Comput. Pract. Exp. 24 (2023) 181-190.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, in: Neural Information Processing Systems, 2017, pp. 5998-6008.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] Y. Liu, M. Lapata, Text summarization with pretrained encoders, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, ACL, 2019, pp. 3728-3738.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer, BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension, in: Annual Meeting of the Association for Computational Linguistics, 2020, pp. 7871-7880.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, Journal of Machine Learning Research 21 (2020) 5485-5551.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[25] J. Zhang, Y. Zhao, M. Saleh, P. J. Liu, PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization, in: Proceedings of the 37th International Conference on Machine Learning, volume 119, 2020, pp. 11328-11339.</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[26] I. Beltagy, K. Lo, A. Cohan, SciBERT: A pretrained language model for scientific text, in: Conference on Empirical Methods in Natural Language Processing, ACL, 2019, pp. 3613-3618.</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>[27] M. Guo, J. Ainslie, D. C. Uthus, S. Ontañón, J. Ni, Y.-H. Sung, Y. Yang, LongT5: Efficient text-to-text transformer for long sequences, in: Findings of the Association for Computational Linguistics: NAACL, Association for Computational Linguistics, 2022, pp. 724-736.</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>[28] I. Beltagy, M. E. Peters, A. Cohan, Longformer: The long-document transformer, CoRR abs/2004.05150 (2020).</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>