<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Identifying the Type of Sarcasm in Dravidian Languages using Deep-Learning Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ramya Sivakumar</string-name>
          <email>ramyacsemsec@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>C.Jerin Mahibha</string-name>
          <email>jerinmahibha@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>B.Monica Jenefer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sarcasm is often described as a synonym for irony, but it has a more specific meaning: it is commonly used to mock or convey contempt. Detecting sarcasm in any text on social media is considered essential because a sarcastic statement's intended meaning differs from its literal meaning. A sentiment analysis system automatically checks the polarity of content, but it does not account for the impact of sarcastic statements, and misinterpreting them degrades its performance. Automatically recognizing sarcastic statements in social network data can therefore improve sentiment analysis systems and other NLP-based applications. In this shared task, we have used the ALBERT transformer model to classify a given text as sarcastic or non-sarcastic. Using this model to train on and predict the data, we achieved macro F1 scores of 0.48 and 0.34 for the Tamil and Malayalam datasets, respectively.</p>
      </abstract>
      <kwd-group>
        <kwd>Sarcasm</kwd>
        <kwd>ALBERT</kwd>
        <kwd>Classification</kwd>
        <kwd>Dravidian languages</kwd>
        <kwd>Deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CEUR Workshop Proceedings</title>
      <p>ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Sarcasm is a word derived from the Greek verb “sarkazein,” which means to speak bitterly.
Such words are often used in a humorous way to mock people [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. It is very easy to detect
sarcasm in face-to-face communication, where it can be identified from facial
expressions or tone of voice. However, this is not the case in textual
communication. For instance, among all the social media platforms, YouTube is one where a
majority of people tend to share content on any topic as videos. At the same time, everyone has
the access and privilege to comment on the videos that are posted [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In general, detecting
sarcasm from textual content alone is a challenging task, and the freedom to comment
on YouTube in any preferred language and manner makes the task even more challenging [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
For example, if someone writes, “Oh yes, you’ve been so helpful; thank you so much for
your help” followed by a smiley face, it is easy to tell it is not sarcastic. But if the
message is accompanied by an angry or frustrated look, it is a sign that the person is being sarcastic.
      </p>
      <p>
        Amidst all these difficulties and challenges, researchers still try to come up with new ways
to detect sarcasm, for various reasons. One is to improve communication, because sarcasm often
leads to miscommunication, as the intended meaning in a given situation differs from the
literal meaning of the words used. Sarcasm detection also helps with sentiment analysis tasks that let
MNCs and other big companies analyze their product reviews and act accordingly. With the
increasing use of social media, such detection tasks also help improve cyber security and
wellness [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Sarcasm detection also plays a major role in the development
of AI, as it can improve AI performance by providing more contextually relevant content.
With all these considerations, the shared task on sarcasm detection has been introduced [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] as a
part of FIRE 2023.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Works</title>
      <p>
        Sharma et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] have proposed a hybrid model for the detection of sarcasm in a given dataset.
The hybrid model mainly comprises three components: BERT, USE (Universal Sentence Encoder), and an autoencoder.
The hybrid algorithm was implemented on the SARC, Twitter, and Headlines datasets and
achieved an average accuracy of around 90 percent. Meriem et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] came up with a
fuzzy approach to the task. This approach predicts the correct label based on a
measure known as the sarcasm score, from which the prediction is made. The model was
evaluated on two datasets, SemEval2014 and the Bamman et al. dataset, resulting in F1-scores of 75.9
and 74.8 percent, respectively. Both binary and multi-class classification were performed in
this task. Sundararajan and Palanisamy [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposed a probabilistic model that helped
predict sarcastic texts. The model combined a probabilistic
component with a CNN (convolutional neural network): the confidence level
output by the probabilistic model was fed into the CNN for the actual
prediction. Tweets collected through the Twitter API were used as the data for implementation. This
approach had an accuracy of 97.25 percent. Vinoth and Prabhavathy [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] presented a model named
IMLB-SDC, an intelligent machine learning-based sarcasm detection and classification model.
Besides text processing and feature extraction methods, this model also used
an SVM (Support Vector Machine) with a penalty factor to enhance its performance. Govindan
and Balakrishnan [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] created a model called the hyperbole-based sarcasm detection
model (HbSD). Their paper examined negative-sentiment tweets containing hyperbole for the
sarcasm detection task, with data collected through the Streaming Twitter API.
An accuracy of 78.74% and an F1 score of 0.71 were achieved with the HbSD model. Kalaivani and
Thenmozhi [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] performed sentiment analysis on the Dravidian-CodeMix-FIRE2021 dataset,
which contains comments in three languages: Tamil, Malayalam, and Kannada. They used
a pre-trained BERT model with the ktrain library to perform the task, the main idea being to
analyze comments from YouTube. They achieved macro
F1 scores of 0.47, 0.64, and 0.48, respectively. The task of humor detection was carried out
by training different transformer models, namely Multilingual BERT, Multilingual
DistilBERT, and XLM-RoBERTa, and the results were compared by Bellamkonda et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
Among these, XLM-RoBERTa was found to perform best, with an F1-score of 0.82 and 81.5%
accuracy. The experimental dataset was formed by scraping tweets from Twitter
and filtering on specific tags. Traditional machine learning models were used to detect sarcasm
in the Ben-Sarc corpus [13]; these included Logistic Regression, Decision Tree,
Random Forest, Multinomial Naive Bayes, K-Nearest Neighbors, Linear Support Vector Machine,
and Kernel SVM. In that work, the BERT model attained the maximum accuracy
of 75.05 percent, with the second-highest accuracy of 72.48 percent achieved by the LSTM model, followed
by 72.36 percent. The use of a bidirectional dual encoder with Additive Margin Softmax to
perform the offensive language classification task was proposed by Mahibha et al. [14], which
resulted in an F1 score of 0.865.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3. Dataset</title>
      <p>The task of sarcasm detection was implemented for two Dravidian languages, namely
Tamil and Malayalam. Separate datasets in code-mixed Tamil-English and Malayalam-English
were provided by the task organizers for carrying out the task. The
text in the dataset was represented in both Roman and native scripts. Text and label information
were provided for each instance in the dataset: the text is the actual comment that was
posted on social media, and the label defines the two main sub-categories into which the comments
are grouped, sarcastic and non-sarcastic. The training dataset is first fed to the
proposed deep learning model, which learns from the data so that it can later be used for
prediction. The model is then fed the validation dataset, commonly known as the development
dataset, for further tuning: using its instances, the model's parameters are fine-tuned to
increase the accuracy of the results. The last phase of the task involves the test dataset,
which contains only the text for which the corresponding labels have to be predicted using the
trained model. The number of instances under each category of the different datasets is shown
in Table 1. The training datasets for Tamil and Malayalam had 27036 and 12057
samples, respectively. Similarly, the validation datasets had 6759 and 3015 instances in Tamil
and Malayalam, respectively, while the test dataset contained 8449 comments for Tamil
and 3768 comments for Malayalam, for which the labels had to be
predicted.</p>
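      <p>The splits described above can be read with a small helper; the tab-separated layout, column names, and sample rows below are assumptions for illustration, not the organizers' exact file format:</p>

```python
import csv
import io

# Hypothetical loader for the shared-task files. Each split is assumed to be a
# tab-separated file with a "text" column and, except for the test split, a
# "label" column; the sample rows are invented for illustration.
sample_tsv = "text\tlabel\nEnna oru padam!\tSarcastic\nSuper movie\tNon-sarcastic\n"

def load_split(fileobj):
    """Return (text, label) pairs; label is None when the column is absent."""
    reader = csv.DictReader(fileobj, delimiter="\t")
    return [(row["text"], row.get("label")) for row in reader]

rows = load_split(io.StringIO(sample_tsv))
print(rows[0])  # ('Enna oru padam!', 'Sarcastic')
```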
    </sec>
    <sec id="sec-5">
      <title>4. System Description</title>
      <p>Given a dataset, text classification and prediction are implemented as a sequence of steps.
Initially, the training and validation datasets are fed into the system for pre-processing. Various
pre-processing techniques, including tokenization, stemming, lemmatization, and the removal
of stop words, are applied, which helps in obtaining more accurate results.</p>
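      <p>A minimal sketch of this pre-processing step follows; the stop-word list and the crude suffix-stripping stemmer are illustrative stand-ins for the real stemmer or lemmatizer (e.g. from NLTK or spaCy) that such a pipeline would normally use:</p>

```python
import re

# Illustrative pre-processing: lowercasing, tokenization, stop-word removal,
# and naive suffix stripping in place of a real stemmer (not Porter).
STOP_WORDS = {"the", "is", "a", "an", "so", "and"}

def preprocess(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    stemmed = []
    for t in tokens:
        # strip a common English suffix once, if the remaining stem is long enough
        for suffix in ("ing", "ed", "s"):
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed

print(preprocess("The acting is SO amazing, thanks a lot"))
# ['act', 'amaz', 'thank', 'lot']
```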
      <p>The next process is data encoding. The transformer model accepts data in numerical format;
hence, in order to feed the data into the model, the cleaned text is encoded into
numerical data. The encoded values map each token to an index in the
model’s vocabulary.</p>
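      <p>The encoding step amounts to a lookup from tokens to vocabulary indices. The tiny vocabulary below is hypothetical; a real ALBERT tokenizer uses a SentencePiece vocabulary of roughly 30,000 pieces:</p>

```python
# Toy vocabulary mapping tokens to integer IDs; unknown tokens fall back
# to the [UNK] index, mirroring how a real subword tokenizer behaves.
vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
         "this": 4, "movie": 5, "was": 6, "so": 7, "great": 8}

def encode(tokens):
    return [vocab.get(tok, vocab["[UNK]"]) for tok in tokens]

ids = encode(["[CLS]", "this", "movie", "was", "so", "great", "[SEP]"])
print(ids)  # [2, 4, 5, 6, 7, 8, 3]
```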
      <p>Following this, model selection is done, where a suitable version of the model is chosen to
implement the classification process. The proposed system uses the ALBERT [15] model for
implementation.</p>
      <p>The proposed model tokenizes the data to satisfy its input
requirements, marking the text with special tokens such as the classifier ([CLS]) and
separator ([SEP]) tokens. It is important to ensure that all tokenized sequences are of the same length, so
padding is applied where necessary. Input formatting is also performed on the input
data: ALBERT models expect the input to include segment IDs along with
attention masks. The segment IDs are responsible for differentiating between sentence pairs, and the
attention masks indicate to the model which tokens need attention. Hence, it is necessary
that the input data be in this format. Fine-tuning helps the model make predictions based on
the encoded input data, which is followed by optimization. The ALBERT model reduces loss
and optimizes the output using SOP (sentence order prediction), which reduces loss by avoiding
topic prediction. The next significant step is training the model on the
dataset and evaluating it: the model is run on the development dataset, and
its performance is noted. Based on these observations, hyperparameters can be
adjusted to achieve better results. After training, the model
predicts the labels for the instances of the validation dataset, and the output is compared with
the actual labels. The architecture of the proposed model is represented by Figure 1.</p>
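      <p>The padding, attention masks, and segment IDs described above can be sketched as follows; the field names follow the common transformer input convention, and the token IDs are toy values:</p>

```python
# Sketch of input formatting: token IDs padded to a fixed length, an attention
# mask marking real tokens (1) versus padding (0), and segment IDs separating
# sentence pairs (all zeros here, since a comment is a single sentence).
def format_input(token_ids, max_len=10, pad_id=0):
    n = min(len(token_ids), max_len)
    ids = token_ids[:n] + [pad_id] * (max_len - n)
    attention_mask = [1] * n + [0] * (max_len - n)
    segment_ids = [0] * max_len  # single sentence: one segment
    return {"input_ids": ids, "attention_mask": attention_mask,
            "token_type_ids": segment_ids}

batch = format_input([2, 4, 5, 6, 3])
print(batch["input_ids"])       # [2, 4, 5, 6, 3, 0, 0, 0, 0, 0]
print(batch["attention_mask"])  # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```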
      <p>Finally, the test data, which is new unseen data, is fed into the model, and labels are
generated for it. We have used the ALBERT model rather than other BERT variants for
classification because it supports Indian languages and classifies text efficiently through
parameter sharing, which reduces overfitting and speeds up computation. The model is also
highly scalable, which makes it versatile.</p>
      <p><bold>4.1. ALBERT</bold></p>
      <p>ALBERT [15] stands for “A Lite BERT,” as it is derived from the BERT model. BERT
(Bidirectional Encoder Representations from Transformers) is a transformer model that uses
transformer encoders to process the given input data. Both BERT and ALBERT use
the same backbone architecture, represented by Figure 2. The advantages of ALBERT
over BERT are its faster computation and its ability to perform
well even with a smaller number of parameters. The number of parameters is
reduced by cross-layer parameter sharing and factorization of the embedding matrix: the
embeddings are split into two matrices, where the input-level embeddings
handle context-independent learning and the
hidden-level embeddings are responsible for context-dependent learning. ALBERT learns from
the given input data using masked language modeling, and additionally uses
a self-supervised sentence order prediction loss to capture the inter-sentence relations in the
given input data.</p>
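      <p>The factorized embedding parameterization can be illustrated with a quick parameter count. The sizes below match the albert-base configuration (vocabulary V=30000, embedding size E=128, hidden size H=768); treat them as illustrative:</p>

```python
# Instead of one large V x H embedding table (BERT-style), ALBERT factorizes
# the embedding into a V x E table plus an E x H projection, with E much
# smaller than H, cutting the embedding parameter count substantially.
V, E, H = 30000, 128, 768

bert_style = V * H             # single context-independent embedding table
albert_style = V * E + E * H   # low-rank factorization into two matrices

print(bert_style, albert_style)  # 23040000 3938304
```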
      <p>ALBERT is a pre-trained model, and hence operations with it are made easy using the
TensorFlow Hub.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Results</title>
      <p>Table 2 shows the results of the sarcasm detection task. Using the
proposed model, we were able to predict the labels for the comments given in the dataset,
yielding an accuracy of 0.81 and 0.79 for the Tamil and Malayalam datasets, respectively. In
the Tamil dataset, out of 8452 comments, 1883 were predicted as sarcastic and 6567 as
non-sarcastic; similarly, in the Malayalam dataset, out of 3768 comments,
69 were predicted as sarcastic and 3699 as non-sarcastic. Macro-F1 scores of 0.48 and 0.34 were achieved
on the Tamil and Malayalam datasets, respectively. The classification reports obtained for Tamil
and Malayalam are represented by heatmaps in Figure 3 and Figure 4.</p>
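      <p>The gap between the high accuracy and the low macro-F1 scores reported above is explained by class imbalance: macro-F1 averages per-class F1, so a weak minority class drags it down. The counts below are illustrative, not the task's actual confusion matrix:</p>

```python
# Macro-F1 from per-class (tp, fp, fn) counts. With a dominant majority class
# and a poorly predicted minority class, accuracy stays high while macro-F1
# drops, as observed in the shared-task results.
def f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def macro_f1(per_class_counts):
    scores = [f1(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)

# hypothetical (tp, fp, fn) for "non-sarcastic" and "sarcastic"
counts = [(6000, 1500, 500), (400, 500, 1500)]
print(round(macro_f1(counts), 2))  # 0.57
```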
    </sec>
    <sec id="sec-7">
      <title>6. Error Analysis</title>
      <p>While comparing the labels predicted by the proposed model with the actual labels
for each instance of the dataset, it was found that there are both false positive and false negative
predictions, which is reflected in the F1-score obtained during the process.</p>
      <p>One reason for the errors in the predictions could be the absence of any explicitly sarcastic word,
causing a text to be classified as “non-sarcastic” instead of the appropriate label “sarcastic”. Another
reason could be that a few texts contain no words at all but only symbols; hence, it is
difficult to predict the correct label, and such texts are also classified as non-sarcastic instead of
the actual label “sarcastic”. All of these example texts demonstrate sarcasm and play a significant
role in the classification process. Sample text instances that were misclassified are shown in
Tables 3 and 4.</p>
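      <p>Symbol-only comments of the kind noted above can be flagged with a simple check before classification; this heuristic is illustrative and not part of the submitted system:</p>

```python
# A comment made up entirely of symbols or emoji gives the model nothing
# lexical to work with; this check flags such instances. isalpha() covers
# alphabetic characters in any script, so Tamil/Malayalam text passes.
def is_symbol_only(text):
    return not any(ch.isalpha() for ch in text)

print(is_symbol_only("!!!???"))    # True
print(is_symbol_only("super da"))  # False
```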
    </sec>
    <sec id="sec-8">
      <title>7. Conclusion</title>
      <p>The way people communicate online is getting more and more complicated, so traditional
feature-based or machine-learning-based methods fall short when trying to
detect sarcasm. It is important to differentiate between sarcastic and non-sarcastic text when it
comes to online content. Relying only on cues such as language, sentiment,
and syntax can be misleading; context and semantic information are key to
spotting sarcasm. In the future, we want to improve our work by training on a
bigger dataset. In addition, emojis and emoticons are important for conveying what a
comment means on social media, so we will consider incorporating them into the text as well.
</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Reyes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Veale</surname>
          </string-name>
          ,
          <article-title>A multidimensional approach for detecting irony in twitter</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          <volume>47</volume>
          (
          <year>2013</year>
          )
          <fpage>239</fpage>
          -
          <lpage>268</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Birjali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kasri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Beni-Hssane</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey on sentiment analysis: Approaches, challenges and trends</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>226</volume>
          (
          <year>2021</year>
          )
          <fpage>107134</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Mahibha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kayalvizhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Thenmozhi</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis using cross lingual word embedding model</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T. P.</given-names>
            <surname>Nagarhalli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vaze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rana</surname>
          </string-name>
          ,
          <article-title>Impact of machine learning in natural language processing: A review, in: 2021 third international conference on intelligent communication technologies and virtual mobile networks (ICICV)</article-title>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>1529</fpage>
          -
          <lpage>1534</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sripriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bharathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nandhini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Chinnaudayar</given-names>
            <surname>Navaneethakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. K.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rajkumar</surname>
          </string-name>
          ,
          <article-title>Overview of the shared task on sarcasm identification of Dravidian languages (Malayalam and Tamil) in DravidianCodeMix, in: Forum of Information Retrieval and Evaluation FIRE - 2023</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <article-title>Sarcasm detection over social media platforms using hybrid auto-encoder-based model</article-title>
          ,
          <source>Electronics</source>
          <volume>11</volume>
          (
          <year>2022</year>
          )
          <fpage>2844</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Meriem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hlaoua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Romdhane</surname>
          </string-name>
          ,
          <article-title>A fuzzy approach for sarcasm detection in social networks</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>192</volume>
          (
          <year>2021</year>
          )
          <fpage>602</fpage>
          -
          <lpage>611</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Palanisamy</surname>
          </string-name>
          ,
          <article-title>Probabilistic model based context augmented deep learning approach for sarcasm detection in social media</article-title>
          ,
          <source>Int. J. Adv. Sci. Technol</source>
          <volume>29</volume>
          (
          <year>2020</year>
          )
          <fpage>8461</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Vinoth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prabhavathy</surname>
          </string-name>
          ,
          <article-title>An intelligent machine learning-based sarcasm detection and classification model on social networks</article-title>
          ,
          <source>The Journal of Supercomputing</source>
          <volume>78</volume>
          (
          <year>2022</year>
          )
          <fpage>10575</fpage>
          -
          <lpage>10594</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>V.</given-names>
            <surname>Govindan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Balakrishnan</surname>
          </string-name>
          ,
          <article-title>A machine learning approach in analysing the effect of hyperboles using negative sentiment tweets for sarcasm detection</article-title>
          ,
          <source>Journal of King Saud University-Computer and Information Sciences</source>
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <fpage>5110</fpage>
          -
          <lpage>5120</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kalaivani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Thenmozhi</surname>
          </string-name>
          ,
          <article-title>Multilingual sentiment analysis in tamil malayalam and kannada code-mixed social media posts using mBERT</article-title>
          ,
          <source>in: FIRE (Working Notes)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1020</fpage>
          -
          <lpage>1028</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellamkonda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lohakare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>A dataset for detecting humor in Telugu social media text</article-title>
          ,
          <source>in: Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, Association for Computational Linguistics</source>
          , Dublin, Ireland,
          <year>2022</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>14</lpage>
          . URL: https://aclanthology.org/2022.dravidianlangtech-1.2. doi:10.18653/v1/2022.dravidianlangtech-1.2.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name><given-names>S. K.</given-names><surname>Lora</surname></string-name>
          ,
          <string-name><given-names>G.</given-names><surname>Shahariar</surname></string-name>
          ,
          <string-name><given-names>T.</given-names><surname>Nazmin</surname></string-name>
          ,
          <string-name><given-names>N. N.</given-names><surname>Rahman</surname></string-name>
          ,
          <string-name><given-names>R.</given-names><surname>Rahman</surname></string-name>
          ,
          <string-name><given-names>M.</given-names><surname>Bhuiyan</surname></string-name>
          , et al.,
          <article-title>Ben-Sarc: A corpus for sarcasm detection from Bengali social media comments and its baseline evaluation</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name><given-names>J.</given-names><surname>Mahibha</surname></string-name>
          ,
          <string-name><given-names>S.</given-names><surname>Kayalvizhi</surname></string-name>
          ,
          <string-name><given-names>D.</given-names><surname>Thenmozhi</surname></string-name>
          ,
          <article-title>Offensive language identification using machine learning and deep learning techniques</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name><given-names>Z.</given-names><surname>Lan</surname></string-name>
          ,
          <string-name><given-names>M.</given-names><surname>Chen</surname></string-name>
          ,
          <string-name><given-names>S.</given-names><surname>Goodman</surname></string-name>
          ,
          <string-name><given-names>K.</given-names><surname>Gimpel</surname></string-name>
          ,
          <string-name><given-names>P.</given-names><surname>Sharma</surname></string-name>
          ,
          <string-name><given-names>R.</given-names><surname>Soricut</surname></string-name>
          ,
          <article-title>ALBERT: A lite BERT for self-supervised learning of language representations</article-title>
          , arXiv preprint arXiv:1909.11942 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>