<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Summarizing English News Articles: Leveraging T5 and Google Gemini 1.0 Pro</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>M Saipranav</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Murari Sreekumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Durairaj Thenmozhi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shreyas Karthik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rahul VS</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sri Sivasubramaniya Nadar College Of Engineering</institution>
          ,
          <addr-line>Rajiv Gandhi Salai (OMR), Kalavakkam, 603 110, Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Text summarization presents a considerable challenge in the field of Natural Language Generation (NLG) due to its inherent context dependent nature. Despite growing research in this area, text summarization for Indian languages has received limited attention. The ILSUM 2024 shared task aims to address this gap by using machine learning approaches in order to generate meaningful fixed-length summaries, either extractive or abstractive, of articles in multiple Indian languages. Our team focused on the English dataset and employed machine learning approaches such as T5 and Gemini 1.0 Pro to generate the summary. In terms of BERT scores, the Gemini model outperformed T5, achieving a BERT F1-score of 0.8675 compared to T5's 0.8489. However, T5 excelled in ROUGE metrics, with a ROUGE-L score of 0.2644 versus Gemini's 0.233. Our models achieved an overall rank of 6, based on their highest ROUGE and BERT scores.</p>
      </abstract>
      <kwd-group>
        <kwd>Text Summarization</kwd>
        <kwd>Machine Learning Algorithms</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Natural Language Generation</kwd>
        <kwd>T5 Model</kwd>
        <kwd>Google Gemini 1.0 Pro</kwd>
        <kwd>Text Analytics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>We participated in Task 1: Language Summarization for Indian Languages, aiming to generate
meaningful, fixed-length summaries for English-language texts from the given news articles, each
consisting of a headline and article pair.</p>
      <p>We used two pre-trained models, the Transformer-based T5 (Text-to-Text Transfer Transformer)
and Google Gemini 1.0 Pro, both of which are trained on large corpora to learn linguistic patterns that
improve summary quality. Using these methods, we addressed many of the challenges involved in
summarization and gained confidence that the approach can be applied to other languages.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>Numerous efforts have been dedicated to advancing text summarization techniques across a range of
Indian languages. Significant efforts have been made by researchers around the world to summarize
text by applying machine learning models such as T5, BART and Pegasus. In addition to these, various
other extractive and abstractive methods have been used that consistently provide excellent accuracy
in summarizing texts.</p>
      <p>
        Research [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] delves into text summarization for English and Hindi. For the English dataset, the text was first
preprocessed, after which the T5-base and T5-small models were applied. For the regional languages, the
texts were first preprocessed and translated to English; after summarization with T5-base, the output was
translated back to the original regional language.
      </p>
      <p>
        Another research [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] attempts to summarize news in English, Hindi and Gujarati using data built from article and headline
pairs scraped from various external websites. For Hindi and Gujarati, multilingual models such as MT5,
MBart and IndicBART variants were used, while PEGASUS, BART, T5 and ProphetNet models were
fine-tuned and applied to the English data.
      </p>
      <p>
        Dhaval Taunk et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] performed text summarization using multilingual models like mBART, mT5
and IndicBART. After fine-tuning the models on the datasets, they also performed data augmentation
in two ways: by appending 3X data to the actual dataset and by appending 5X data to the actual dataset,
which gave a significant improvement in the results. The HuggingFace API and PyTorch were also used
to fine-tune the models for 5, 7 or 10 epochs, depending on the model.
      </p>
      <p>
        Research [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] implemented deep learning based approaches for article summarization in Indian
languages. The state-of-the-art Pegasus model was fine-tuned and used, as it worked best for the English
dataset, and the IndicBART model along with data augmentation was used for the Hindi dataset. Apart
from these, BRIO (Bringing Order to Abstractive Summarization), SentenceBERT and T5 were used for
English, and XL-Sum, mBART and Translation+Mapping+PEGASUS were used for Hindi and Gujarati.
      </p>
      <p>
        An extractive approach for automated summarization of Indian languages was used in research [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
This research first reduces a text document to a new form that conveys the key meaning of the
information contained in the text. Splitting and vectorization, using word and sentence vectorization
techniques, were then applied to the dataset, after which K-means clustering was used. K-means
complements the efficiency of extractive techniques, as it is quick and suitable for both small and large
samples. Fine-tuning the parameters may prove to be very fruitful.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset Description</title>
      <p>The dataset for this task (Table 1) contains more than 13,000 news articles in each language, drawn from
the leading newspapers of the country. The objective of this task is to generate meaningful fixed-length
summaries, be it extractive or abstractive, for each article.</p>
      <p>A unique feature of the dataset is that the Indian language datasets exhibit code-mixing in both the
headlines as well as the articles, where English is mixed with the Indian language content.</p>
      <p>The dataset consists of CSV files in languages such as Hindi, Gujarati, English, Tamil, Kannada, Telugu,
and Bengali, where each language has 3 CSV files containing the training, testing, and validation data,
respectively. The training and validation data contain columns for headings, articles, and summaries,
providing a foundation for generating meaningful summaries for the given articles.</p>
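      <p>A minimal sketch of loading one of these language splits with pandas is shown below; the file and column names are assumptions based on the description above rather than the exact names in the released CSV files.</p>
      <preformat>
# Illustrative loading of the English split (file/column names are assumed).
import pandas as pd

train_df = pd.read_csv("english_train.csv")
val_df = pd.read_csv("english_validation.csv")
test_df = pd.read_csv("english_test.csv")

# Training and validation files are expected to carry heading, article and summary columns.
print(train_df.columns.tolist())
print(len(train_df), "training rows")
      </preformat>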
      <sec id="sec-3-1">
        <title>3.1. Task Description</title>
        <p>The task is to generate concise and meaningful fixed-length summaries for news articles in multiple
Indian languages, considering the challenges of code-mixing and script mixing. The dataset includes
article and headline pairs from leading newspapers in English, Hindi, Gujarati, Bengali, and other
languages. We performed this task using the English dataset.</p>
        <p>Table 1 lists the dataset splits used for this task (training, validation, and testing), all of which are in English.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Approach</title>
      <p>We fine-tuned the models Gemini 1.0 Pro and T5-Base on the training dataset, evaluated them on the
validation (dev) dataset, and submitted our runs by applying them to the test dataset.</p>
      <sec id="sec-4-1">
        <title>4.1. Data Preparation</title>
        <p>Our first step was to clean the given data (Figure 1) in order to improve the performance of the machine
learning models; a minimal sketch of these cleaning steps is shown after the list.</p>
        <list list-type="order">
          <list-item>
            <p>Converting the text to lowercase: this ensures consistency in the text data, reduces the vocabulary size, and lowers the computational requirements.</p>
          </list-item>
          <list-item>
            <p>Removing punctuation marks: these add noise without contributing to the meaning of the text being analyzed.</p>
          </list-item>
          <list-item>
            <p>Removing HTTP links and emoticons: these often point to external resources and do not contribute to the semantic meaning of the text.</p>
          </list-item>
          <list-item>
            <p>Removing Twitter mentions such as @username.</p>
          </list-item>
          <list-item>
            <p>Removing HTML entities and tags: these often make it difficult for the model to understand the text.</p>
          </list-item>
          <list-item>
            <p>Replacing specific Unicode characters (e.g., non-breaking spaces, dashes, quotes) with their appropriate readable forms, and removing non-ASCII characters from the text.</p>
          </list-item>
        </list>
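        <p>The sketch below illustrates these cleaning steps in Python; the regular expressions and the order of the steps are our own illustrative choices rather than the exact implementation used in our runs.</p>
        <preformat>
# Illustrative text-cleaning sketch covering the steps listed above.
import re
import string

def clean_text(text: str) -> str:
    text = text.lower()                                   # 1. lowercase
    text = re.sub(r"http\S+|www\.\S+", " ", text)         # 3. strip http links
    text = re.sub(r"@\w+", " ", text)                     # 4. strip Twitter mentions
    text = re.sub(r"&amp;\w+;", " ", text)                     # 5. strip HTML entities
    text = re.sub(r"&lt;[^>]+>", " ", text)                    # 5. strip HTML tags
    text = text.replace("\u00a0", " ")                    # 6. non-breaking space
    text = text.replace("\u2013", "-").replace("\u2014", "-")   # 6. dashes
    text = text.replace("\u2018", "'").replace("\u2019", "'")   # 6. single quotes
    text = text.replace("\u201c", '"').replace("\u201d", '"')   # 6. double quotes
    text = text.encode("ascii", "ignore").decode()        # 6. drop non-ASCII (incl. emoticons)
    text = text.translate(str.maketrans("", "", string.punctuation))  # 2. strip punctuation
    return re.sub(r"\s+", " ", text).strip()

print(clean_text("Chandu Champion OTT Release: Here's How You Can Watch Kartik Aaryan's Film Online"))
# -> chandu champion ott release heres how you can watch kartik aaryans film online
        </preformat>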
        <p>A sample of our results is shown in Table 2. Before preprocessing: "Chandu Champion OTT Release: Here’s How You Can Watch Kartik Aaryan’s Film Online". After preprocessing: "chandu champion ott release heres how you can watch kartik aaryans film online".</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Methodology</title>
        <sec id="sec-4-2-1">
          <title>We have used 2 models to evaluate the dataset. They are:</title>
          <p>
            4.2.1. Google Gemini 1.0 Pro:
Gemini 1.0 Pro [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ] is a powerful large language model (LLM) designed for a variety of natural language
tasks, including text summarization. It is a large language model (LLM) that is based on the Transformer
architecture. This architecture is a type of neural network that is specifically designed for processing
sequential data, such as text.
          </p>
          <p>Gemini 1.0 Pro also uses a number of other techniques, such as pre-training on large datasets of
text and fine-tuning on specific tasks. These techniques help to improve the model’s accuracy and
performance. Gemini 1.0 Pro generates text summarization by leveraging its deep understanding of
language and context. It employs a combination of techniques, including:</p>
          <p>Sequence-to-Sequence Modeling: The model treats summarization as a sequence-to-sequence task,
where it takes a long input sequence (the original text) and generates a shorter output sequence (the
summary). This approach allows the model to capture the overall context and structure of the original
text.</p>
          <p>Attention Mechanism: The attention mechanism enables the model to focus on the most relevant
parts of the input text while generating the summary. By assigning weights to different words or
phrases, the model can prioritize the most important information and exclude less relevant details.</p>
          <p>Encoder-Decoder Architecture: The model consists of an encoder that processes the input text and a
decoder that generates the summary. The encoder transforms the input text into a sequence of hidden
representations, while the decoder uses these representations and the attention mechanism to generate
the summary.</p>
          <p>Pre-training on Large Datasets: Gemini 1.0 Pro is pre-trained on massive amounts of text data,
allowing it to learn the nuances of language and develop a strong understanding of context. This
pre-training helps the model generate more accurate and informative summaries.</p>
          <p>Fine-tuning for Specific Tasks: For summarization tasks, the model can be further fine-tuned on
specific datasets to improve its performance. This involves training the model on a large number
of examples of input texts and their corresponding summaries, allowing it to learn the patterns and
characteristics of effective summarization. The architecture diagram of the Gemini 1.0 Pro model is shown
in Figure 2.</p>
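          <p>As an illustration of how such a hosted model can be prompted for summarization, a minimal sketch using the google-generativeai Python client is shown below; the API key placeholder, prompt wording, and word limits are assumptions rather than our exact run configuration.</p>
          <preformat>
# Minimal sketch: prompting Gemini 1.0 Pro for summarization via the google-generativeai client.
# The prompt wording and word limits below are illustrative assumptions.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")          # assumed placeholder API key
model = genai.GenerativeModel("gemini-1.0-pro")  # Gemini 1.0 Pro model name

article = "A cleaned news article goes here."
prompt = (
    "Summarize the following news article in roughly 50 to 75 words, "
    "preserving the key facts:\n\n" + article
)

response = model.generate_content(prompt)
print(response.text)
          </preformat>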
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. T5 (Text-to-Text Transfer Transformer)</title>
          <p>
            In this study, we employed the T5 (Text-to-Text Transfer Transformer) model developed by Google
[
            <xref ref-type="bibr" rid="ref14">14</xref>
            ] for our text summarization task. This model features an encoder-decoder structure, specifically
designed for the text-to-text approach, and is trained on a 750 GB dataset, the Colossal Clean Crawled
Corpus (C4) [15]. The authors of the T5 paper (Raffel et al.) [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ] achieved optimal performance by
training the model for 1 million steps with a batch size of 2^11 (2,048) sequences and a
maximum length of 512 tokens, ensuring comprehensive learning across diverse tasks.
          </p>
          <p>In T5, every task is converted to a text-to-text format, enabling the model to address any NLP task
without requiring adjustments to the hyperparameters or loss functions. The model manages a variety
of tasks by prepending a different prefix to the input corresponding to each task, e.g., for summarization:
"summarize: {text}", or for translation: "translate {text} to {language}". It incorporates both supervised
and self-supervised training methods, leveraging benchmarks like GLUE and SuperGLUE while also
employing an approach to self-supervised training using corrupted tokens.</p>
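          <p>A small illustration of this task-prefix convention is given below; the example sentences are our own, and the translation prefix shown is the standard "translate English to German" prefix from the original T5 training mixture.</p>
          <preformat>
# The text-to-text convention: each task is selected by prepending a textual prefix.
article = "The government announced a new policy on renewable energy on Monday."

summarization_input = "summarize: " + article
translation_input = "translate English to German: That is good."

print(summarization_input)
print(translation_input)
          </preformat>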
          <p>Additionally, T5 is "unified" in the sense that it can carry out multiple NLG tasks at once. Unlike
other transformers like BERT and GPT2, it does not require distinct output layers for various tasks. The
output can be a string of numbers, even for regression tasks.</p>
          <p>Owing to the above reasons, the T5 model has performed better than several other contemporary
SOTA architectures in different NLG tasks. The architecture of the T5 model is shown in Figure 3.</p>
          <p>
            The T5 model’s architecture features an encoder-decoder framework built from stacked identical
layers. A position-wise feed-forward network and a multi-head self-attention mechanism make up
each encoder layer’s two primary parts. Layer normalization and residual connections are applied
after each sub-layer. This structure is mirrored in the decoder, which also has a third sub-layer that
applies multi-head attention to the encoder’s outputs. The model performs exceptionally well on a
range of natural language processing tasks thanks to this design, which also makes information flow
more efficient [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ].
          </p>
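          <p>The stacked layout described above can be inspected directly from the published model configuration; a small sketch using the HuggingFace transformers library is shown below (for t5-base we expect 12 encoder and 12 decoder layers, a hidden size of 768, 12 attention heads, and a feed-forward size of 3072).</p>
          <preformat>
# Inspect the t5-base configuration to see the encoder-decoder layout described above.
from transformers import T5Config

config = T5Config.from_pretrained("t5-base")
print("encoder/decoder layers:", config.num_layers, "/", config.num_decoder_layers)
print("hidden size (d_model): ", config.d_model)
print("attention heads:       ", config.num_heads)
print("feed-forward size (d_ff):", config.d_ff)
          </preformat>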
          <p>T5 for Text Summarization: T5 comes in different sizes, such as google-t5/t5-small,
google-t5/t5-base, google-t5/t5-large, google-t5/t5-3b and google-t5/t5-11b. We chose the google-t5/t5-base variant,
which has 220 million parameters and effectively strikes a balance between model performance and
computational efficiency. The model and tokenizer were initialized using HuggingFace’s T5ForConditionalGeneration
and T5Tokenizer, and the model was moved to a CUDA-enabled GPU to expedite processing. The
computational hardware used for this task is listed in Table 3.</p>
          <p>Table 3. System configuration. System: Boston Supermicro 4U with Intel Skylake processor, 128 GB DDR4 RAM, 6 TB SATA HDD, 480 GB SATA SSD, and 4x Nvidia GeForce RTX 2080 11 GB graphics cards. OS/Software: Ubuntu 18.04 LTS Server / Jupyter Notebook.</p>
          <p>Summarization was conducted in batches to optimize memory and computational throughput. Each
input text was formatted as "summarize the following article, and convey the meaning briefly but
exhaustively: {heading} {article}", allowing the model to focus on the heading to guide the summarization
process.</p>
          <p>Summaries were constrained to a maximum of 75 words and a minimum of 50 words, to maintain
conciseness and prevent impacts on ROUGE precision and recall scores while evaluating against the
ground truth summary.</p>
          <p>To enhance the output quality, beam search (num_beams=15) was employed, evaluating multiple
potential paths, while a length penalty of 1.3 discouraged verbosity, resulting in more fluent summaries.
Additionally, repeated phrases were minimized using a No Repeat N-gram Size set to 5, improving
readability and coherence.</p>
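          <p>A condensed sketch of this setup, covering model initialization and the batched generation settings described above, is shown below; the token-length bounds stand in for the 50 to 75 word constraint and, like the truncation length, are approximations rather than our exact values.</p>
          <preformat>
# Sketch: t5-base initialization and batched summary generation with the settings described above.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = T5Tokenizer.from_pretrained("google-t5/t5-base")
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-base").to(device)

def summarize_batch(headings, articles):
    prompts = [
        "summarize the following article, and convey the meaning briefly but exhaustively: "
        + h + " " + a
        for h, a in zip(headings, articles)
    ]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                       truncation=True, max_length=512).to(device)
    outputs = model.generate(
        **inputs,
        num_beams=15,            # beam search over multiple candidate paths
        length_penalty=1.3,      # length penalty applied during beam scoring
        no_repeat_ngram_size=5,  # suppress repeated 5-grams
        min_length=60,           # approximate lower bound for a 50-word summary (in tokens)
        max_length=110,          # approximate upper bound for a 75-word summary (in tokens)
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(summarize_batch(["Sample heading"], ["Sample article text."])[0])
          </preformat>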
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Performance Analysis</title>
      <p>To evaluate the effectiveness of our models, we used the suite of BERTScore and ROUGE metrics to assess our
T5 and Gemini 1.0 Pro runs.</p>
      <p>ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a widely-used set of metrics for
evaluating automatic text summarization and machine-generated content against reference summaries.
The most common types of ROUGE include ROUGE-N, ROUGE-L, and ROUGE-W. ROUGE-N measures
the n-gram overlap, where n can be 1 for unigrams or 2 for bigrams, quantifying how well individual
words or short phrases match. ROUGE-L assesses the longest common subsequence, providing insights
into the syntactic structure alignment, while ROUGE-W weighs consecutive matches higher, favoring
coherent text flow. BERTScore, on the other hand, leverages contextual embeddings from BERT
(Bidirectional Encoder Representations from Transformers) to capture semantic similarity by comparing
embeddings between reference and candidate sentences. This approach is beneficial as it goes beyond
surface-level word matching, taking into account the context and meaning, which can be critical for
complex text evaluation.</p>
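      <p>For reference, a minimal sketch of computing these metrics with the rouge-score and bert-score Python packages is shown below; this is an assumed local setup for illustration, not the official task evaluation script.</p>
      <preformat>
# Score a candidate summary against a reference with ROUGE and BERTScore.
from rouge_score import rouge_scorer
from bert_score import score as bert_score

reference = "chandu champion releases on an ott platform this week"
candidate = "chandu champion is now available to stream on an ott platform"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge = scorer.score(reference, candidate)
print({name: round(s.fmeasure, 4) for name, s in rouge.items()})

# BERTScore compares contextual token embeddings of candidate and reference.
P, R, F1 = bert_score([candidate], [reference], lang="en")
print("BERTScore P/R/F1:", round(P.item(), 4), round(R.item(), 4), round(F1.item(), 4))
      </preformat>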
      <p>In terms of BERT scores, the Gemini model outperformed T5, achieving a BERT F1-score of 0.8675
compared to T5’s 0.8489. However, T5 excelled in ROUGE metrics, with a ROUGE-L score of 0.2644
versus Gemini’s 0.233. Our models achieved an overall rank of 6, based on their highest ROUGE and
BERT scores.</p>
      <p>Tables 4 and 5 summarize the results of our team, IdlyVadaSambar, as measured by FIRE for the
aforementioned approaches, alongside the scores obtained by the other teams in the same task. Among the
reported metrics, team-wise ROUGE-2 scores ranged from 0.0437 to 0.2060 and team-wise ROUGE-4 scores
ranged from 0.0111 to 0.1467.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper, we explored English text summarization using advanced NLP/NLG models T5 and Google
Gemini 1.0 Pro, and assessed their generated summaries’ quality against evaluation metrics such as
ROUGE and BERTScore. Our findings indicate that these evaluation techniques validate both the
coherence and informativeness of summaries, highlighting the robustness of text summarization models
when measured against established benchmarks.</p>
      <p>Google Gemini 1.0 Pro gave us the highest BERT scores, with BertScore-Precision at 0.8529,
BertScore-Recall at 0.8829 and BertScore-F1 at 0.8675. The T5 model produced the best ROUGE scores, with ROUGE-1
at 0.3102, ROUGE-2 at 0.1554, ROUGE-4 at 0.0856 and ROUGE-L at 0.2644.</p>
      <p>In conclusion, advancing NLP through this summarization technique offers potential for multilingual
support and fine-tuning on diverse datasets, addressing the needs of various industries. Streamlining
text summarization into real-world applications could make it a valuable tool for managing the rapidly
expanding volume of information. By scaling this technology, we can further NLP’s capabilities, enabling
more systems to process, understand, and condense information across languages and domains.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <sec id="sec-7-1">
        <title>The author(s) have not employed any Generative AI tools.</title>
        <p>[15] TensorFlow, C4 dataset - tensorflow datasets, 2024. URL: https://www.tensorflow.org/datasets/
catalog/c4, accessed on: 26-10-2024.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. HL</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <article-title>Key insights from the third ilsum track at fire 2024, in: Proceedings of the 16th Annual Meeting of the Forum for Information Retrieval Evaluation</article-title>
          ,
          <string-name>
            <surname>FIRE</surname>
          </string-name>
          <year>2024</year>
          , Gandhinagar,
          <source>India. December 12-15</source>
          ,
          <year>2024</year>
          , ACM,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. HL</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <article-title>Overview of the third shared task on indian language summarization</article-title>
          (ilsum
          <year>2024</year>
          ), in: K. Ghosh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          , D. Ganguly (Eds.), Working Notes of FIRE 2024 -
          <article-title>Forum for Information Retrieval Evaluation, Gandhinagar, India</article-title>
          .
          <source>December 12-15</source>
          ,
          <year>2024</year>
          , CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <source>Indian language summarization at FIRE</source>
          <year>2023</year>
          , in: D.
          <string-name>
            <surname>Ganguly</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Majumdar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mitra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gangopadhyay</surname>
          </string-name>
          , P. Majumder (Eds.),
          <source>Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          ,
          <string-name>
            <surname>FIRE</surname>
          </string-name>
          <year>2023</year>
          , Panjim, India,
          <source>December 15-18</source>
          ,
          <year>2023</year>
          , ACM,
          <year>2023</year>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>29</lpage>
          . URL: https://doi.org/10.1145/3632754.3634662. doi:
          <volume>10</volume>
          .1145/3632754.3634662.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <article-title>Key takeaways from the second shared task on indian language summarization (ILSUM 2023)</article-title>
          , in: K. Ghosh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          , M. Mitra (Eds.), Working Notes of FIRE 2023 -
          <article-title>Forum for Information Retrieval Evaluation (FIRE-WN</article-title>
          <year>2023</year>
          ), Goa, India,
          <source>December 15-18</source>
          ,
          <year>2023</year>
          , volume
          <volume>3681</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>724</fpage>
          -
          <lpage>733</lpage>
          . URL: https://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>3681</volume>
          /
          <fpage>T8</fpage>
          -1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          , P. Mehta,
          <article-title>FIRE 2022 ILSUM track: Indian language summarization</article-title>
          , in: D.
          <string-name>
            <surname>Ganguly</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gangopadhyay</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Mitra</surname>
          </string-name>
          , P. Majumder (Eds.),
          <source>Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          ,
          <string-name>
            <surname>FIRE</surname>
          </string-name>
          <year>2022</year>
          , Kolkata, India, December 9-
          <issue>13</issue>
          ,
          <year>2022</year>
          , ACM,
          <year>2022</year>
          , pp.
          <fpage>8</fpage>
          -
          <lpage>11</lpage>
          . URL: https://doi.org/10.1145/3574318.3574328. doi:
          <volume>10</volume>
          .1145/ 3574318.3574328.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <article-title>Findings of the first shared task on indian language summarization (ILSUM): approaches challenges and the path ahead</article-title>
          , in: K. Ghosh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          , M. Mitra (Eds.), Working Notes of FIRE 2022 -
          <article-title>Forum for Information Retrieval Evaluation, Kolkata</article-title>
          , India, December 9-
          <issue>13</issue>
          ,
          <year>2022</year>
          , volume
          <volume>3395</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>369</fpage>
          -
          <lpage>382</lpage>
          . URL: https://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>3395</volume>
          /
          <fpage>T6</fpage>
          -1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <article-title>Fighting fire with fire: Adversarial prompting to generate a misinformation detection dataset</article-title>
          ,
          <source>CoRR abs/2401</source>
          .04481 (
          <year>2024</year>
          ). URL: https://doi.org/10. 48550/arXiv.2401.04481. doi:
          <volume>10</volume>
          .48550/ARXIV.2401.04481. arXiv:
          <volume>2401</volume>
          .
          <fpage>04481</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ilanchezhiyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Darshan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Dhitshithaa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bharathi</surname>
          </string-name>
          ,
          <article-title>Text summarization for indian languages: Finetuned transformer model application</article-title>
          .,
          <source>in: FIRE (Working Notes)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>766</fpage>
          -
          <lpage>774</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Urlana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Surange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <article-title>Indian language summarization using pretrained sequence-to-sequence models</article-title>
          ,
          <source>arXiv preprint arXiv:2303.14461</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Taunk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Summarizing indian languages using multilingual transformers based models</article-title>
          ,
          <source>arXiv preprint arXiv:2303.16657</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Tangsali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pingle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vyawahare</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <article-title>Implementing deep learning-based approaches for article summarization in indian languages</article-title>
          ,
          <source>arXiv preprint arXiv:2212.05702</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kumari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumari</surname>
          </string-name>
          ,
          <article-title>An extractive approach for automated summarization of indian languages using clustering techniques</article-title>
          .,
          <source>in: FIRE (Working Notes)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>418</fpage>
          -
          <lpage>423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Team</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Anil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgeaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.-B. Alayrac</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Soricut</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Schalkwyk</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          <string-name>
            <surname>Dai</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Hauth</surname>
          </string-name>
          , et al.,
          <article-title>Gemini: a family of highly capable multimodal models</article-title>
          ,
          <source>arXiv preprint arXiv:2312.11805</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Raffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Exploring the limits of transfer learning with a unified text-to-text transformer</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>21</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>67</lpage>
          . URL: http://jmlr.org/papers/v21/
          <fpage>20</fpage>
          -
          <lpage>074</lpage>
          .html.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] TensorFlow, C4 dataset - tensorflow datasets, 2024. URL: https://www.tensorflow.org/datasets/catalog/c4, accessed on: 26-10-2024.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>