<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>IndicBERT based approach for Sentiment Analysis on Code-Mixed Tamil Tweets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>R. Ramesh Kannan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ratnavel Rajalakshmi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lokesh Kumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Engineering, Vellore Institute of Technology</institution>
          ,
          <addr-line>Chennai, Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>1</volume>
      <fpage>3</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>Social media networks have had a large impact on everyday life, and many people express their opinions on various topics on platforms such as Facebook and Twitter. Although English is predominantly used across the world to express views, technological advancements have made it possible for people to post opinions in their native language as well. Because many social media users are bilingual, mixing English with a native language has become common. Sentiment analysis, the task of identifying the opinion expressed in these Code-Mixed social media posts, is challenging because existing architectures and algorithms are designed to handle monolingual posts. The diversity and linguistic richness of Indian languages demand sophisticated systems to address these issues. In this work, we conducted an experimental study of the challenges in Code-Mixed Tamil tweets and proposed a transformer-based Indic-BERT approach. The experimental results show that an F1 score of 61.73% can be achieved, which is a significant improvement over other traditional methods. This work was submitted to the Dravidian-CodeMixFIRE 2021 shared task [1].</p>
      </abstract>
      <kwd-group>
        <kwd>Code-Mixed</kwd>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Dravidian Language</kwd>
        <kwd>Tanglish</kwd>
        <kwd>Tamil</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Kannada, Malayalam and Telugu are Dravidian languages spoken by people from Karnataka,
Kerala and Andhra Pradesh. Tamil is a Dravidian language spoken by people from India and
Sri Lanka and by the Tamil diaspora around the world; it is an official language in Singapore
and Sri Lanka. These languages are used for various purposes such as administration, education
and business media. However, people often write their native language in Roman script because
it is easier to type. Hence, the majority of the under-resourced languages in social media are
Code-Mixed in nature.</p>
      <p>Regional languages are used to share people's opinions on social media. Many resources
have been developed for Arabic [4], English and other regional languages. With easy access to
the internet, people readily share Code-Mixed texts online. Such texts need to be understood at
the linguistic level, and the lack of Code-Mixed data to train models is a major challenge during
analysis. A system trained on monolingual data might not be suitable for Code-Mixed data, since
the linguistic structure of Code-Mixed data is different.</p>
      <p>
        The shared task [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] released social media comments in Tanglish (Tamil+English) for sentiment analysis on
Code-Mixed data. Our proposed system shows how sentiment is expressed in Code-Mixed
scenarios on social media by applying the transformer-based approach Indic-BERT [5], and we
obtained an F1 score of 61.73%. This paper is organized as follows: Section 2 reviews related
work in the same domain, Section 3 describes the data set, Section 4 discusses the proposed
methodology for the shared task, Section 5 presents the results obtained using the proposed
methodology, and Section 6 concludes the study.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>Sentiment polarity analysis on online media such as YouTube comments is an important
problem in analyzing people's opinions on public issues, products, sports or movies, and it is a
challenging task. Various authors [6, 7] have carried out research on under-resourced languages.
Sentiment analysis on online media content [8, 9] and social media content [10] has been
studied by various authors. Sentiment analysis on movie reviews was studied in [8, 9]: [9]
analyses online movie reviews with a hybrid approach that combines a Convolutional Neural
Network and Bidirectional Long Short-Term Memory to identify opinions on movie content,
while [8] evaluates feature weighting methods on online movie reviews. A New Relevance Factor
(NRF) weighting method [11] was proposed for text classification using a Naive Bayes classifier.
A universal dictionary method [12] was proposed for text classification on Uniform Resource
Locators (URLs) using a linear SVM. For text classification on legal documents, [13] presents a
context-aware solution based on cosine similarity and Term Frequency - Inverse Document
Frequency (TF-IDF) to obtain the similarity between documents. An attention mechanism was
proposed with a Recurrent Convolutional Neural Network (RCNN) [14] for effective learning of
text features from URLs. A Convolutional Neural Network (CNN) was combined with a
Bidirectional Gated Recurrent Unit (BGRU) [15] to extract features for web page classification.
Sentiment in movie reviews was analysed using Long Short-Term Memory (LSTM) with word
embeddings and a self-attention-based approach to extract the polarity of the reviews [16].
Sentiment analysis on tweets was carried out in [10] by applying a supervised Maximum Entropy
approach, obtaining a cross-validation accuracy of 74%. A detailed survey on sentiment analysis
was presented in [17].</p>
      <p>Sentiment polarity identification on Code-Mixed data is challenging, and several works on
Code-Mixed data sets have been reported recently. The authors in [18] proposed an
ensemble-based machine learning approach that uses n-gram features to classify Hindi-English
and Bengali-English data sets, obtaining F1 scores of 58% and 69% respectively. An ensemble
classifier using Chi-square feature selection [19] was proposed for Code-Mixed Hindi-German
data with a Random Forest classifier. Rajalakshmi et al. [20] proposed a BERT-based approach
for offensive language identification on a Code-Mixed data set by capturing linguistic features,
and obtained a validation F1 score of 65% and a testing F1 score of 64%. Hate speech in
Code-Mixed Marathi and Hindi data was analysed using an ensemble approach [21] with Extreme
Gradient Boosting. Code-Mixed Hindi-English data were analysed for hate and offensive content
detection using IndicBERT [22] with a majority voting approach for HASOC 2021. Code-mixing
and code-borrowing have recently been studied for processing multilingual queries [23, 24, 25].
A relevance-metric-based approach [26] was proposed for borrowing-likeliness ranking on
Hindi-English tweets. The authors of [5] proposed a new multilingual ALBERT-based model,
Indic-BERT, for several Indian languages; it can be applied to various downstream tasks in
Natural Language Processing. In this study, we have applied Indic-BERT to the Dravidian
Code-Mixed data set for sentiment polarity identification.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Data Set Description</title>
      <p>The Dravidian Code-Mixed data set is a collection of YouTube video comments containing
Code-Mixed sentences of three types: Inter-Sentential switch, Intra-Sentential switch and Tag
switch [27]. Almost all the comments were written either in Tamil grammar with English lexicon
or in English grammar with Tamil lexicon, in native script or Roman script. A few of the
comments were in Tamil script with English expressions. The data set contains an ID, a text and
a label for each comment. The ID is a unique number that identifies a particular row, the text is
the YouTube comment, and the label gives the category of the text, which is one of five
categories: Positive, Negative, not-Tamil, unknown state and mixed feelings.</p>
      <p>Examples from the data set:</p>
      <p>Original text: Yarayellam FDFS paga ippove ready agitinga
Meaning: Who are all ready now for FDFS (First Day First Show)? (Positive category)</p>
      <p>Original text: Ennada viswasam mersal sarkar madhri time la likes and views create
pannalayae
Meaning: Why are likes and views not created for films like viswasam, mersal, sarkar?
(Negative category)</p>
      <p>The objective of the task is to identify the sentiment polarity of Code-Mixed comments or
posts in Tamil+English collected from social media, each of which carries one of five category
labels: Positive (Po), Negative (Ne), Mixed_feelings (Mf), not-Tamil (Nt) and
unknown_state (Us). The data distribution is tabulated in Table 1. 56% of the comments are
positive and the remaining 44% fall into the other four categories: Ne with 12%, Nt with 5%,
Mf with 11% and Us with 16%. As part of the sentiment analysis task, the training and
validation sets were released with 35656 and 3962 labelled social media comments respectively.
Both sets follow the same distribution.</p>
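      <p>The class shares quoted above can be turned into rough per-class counts to make the
imbalance concrete. The following is a minimal sketch, assuming only the rounded percentages
and set sizes stated in the text; the resulting counts are approximations and not the exact
figures of Table 1, and the function names are hypothetical.</p>
      <preformat>
```python
# Class shares as stated in the text (percent of comments per label).
CLASS_SHARE_PERCENT = {"Po": 56, "Ne": 12, "Nt": 5, "Mf": 11, "Us": 16}

TRAIN_SIZE = 35656  # labelled training comments
VALID_SIZE = 3962   # labelled validation comments


def approximate_counts(total, share_percent):
    """Return rounded per-class counts implied by percentage shares."""
    return {label: round(total * pct / 100) for label, pct in share_percent.items()}


def imbalance_ratio(share_percent):
    """Ratio between the largest and the smallest class share."""
    return max(share_percent.values()) / min(share_percent.values())


if __name__ == "__main__":
    # The stated shares should cover the whole data set.
    assert sum(CLASS_SHARE_PERCENT.values()) == 100
    print(approximate_counts(TRAIN_SIZE, CLASS_SHARE_PERCENT))
    print(approximate_counts(VALID_SIZE, CLASS_SHARE_PERCENT))
    print(imbalance_ratio(CLASS_SHARE_PERCENT))
```
</preformat>
      <p>Under these assumptions the Positive class is roughly eleven times larger than the
not-Tamil class, which illustrates the skew of the data set.</p>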
    </sec>
    <sec id="sec-4">
      <title>4. Proposed Methodology</title>
      <p>The Code-Mixed comments contain Tamil, English and other-language phrases and words.
Instead of converting the text into a single common language, we use the multilingual
pretrained model Indic-BERT [5]. Indic-BERT is based on ALBERT (A Lite BERT for
Self-Supervised Learning of Language Representations), a recent derivative of BERT
(Bidirectional Encoder Representations from Transformers), and is pretrained on 12 languages:
Assamese, Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi,
Tamil and Telugu. Indic-BERT has fewer parameters than other public models such as mBERT
and XLM-R, while giving state-of-the-art performance on several tasks.</p>
      <p>Since Indic-BERT is a pretrained multilingual transformer model, the data needs to be
converted into the corresponding embeddings for classification. As a preprocessing step,
AutoTokenizer splits every sentence into tokens. A classification [CLS] token is added at the
beginning of each sentence and a separator [SEP] token at the end. Padding [PAD] tokens are
appended to each sentence up to the maximum sentence length. Each token is then mapped to a
unique ID for further processing. An attention mask is also generated for each input sentence;
it tells the model which tokens should be attended to and which (the padding tokens) should be
ignored during training. These inputs are then fed into the transformer-based Indic-BERT model.</p>
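      <p>The preprocessing steps described above can be illustrated with a toy example. This is
only a sketch: a whitespace tokenizer and an on-the-fly vocabulary stand in for the subword
tokenizer of the pretrained model, and the names build_vocab and encode are hypothetical. It
shows how [CLS], [SEP] and [PAD] tokens, token IDs and the attention mask relate to one
another.</p>
      <preformat>
```python
# Toy illustration only: a whitespace tokenizer stands in for the real
# subword tokenizer, and the vocabulary is built on the fly.
CLS, SEP, PAD = "[CLS]", "[SEP]", "[PAD]"


def build_vocab(sentences):
    """Assign a unique integer id to every special token and word."""
    vocab = {PAD: 0, CLS: 1, SEP: 2}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence, vocab, max_length):
    """Return (input_ids, attention_mask) padded or truncated to max_length."""
    # Leave room for the [CLS] and [SEP] tokens.
    words = sentence.lower().split()[: max_length - 2]
    tokens = [CLS] + words + [SEP]
    attention_mask = [1] * len(tokens)  # 1 = real token
    padding = max_length - len(tokens)
    tokens += [PAD] * padding
    attention_mask += [0] * padding     # 0 = padding, ignored by the model
    input_ids = [vocab[token] for token in tokens]
    return input_ids, attention_mask


if __name__ == "__main__":
    comments = ["Yarayellam FDFS paga ippove ready agitinga"]
    vocab = build_vocab(comments)
    ids, mask = encode(comments[0], vocab, max_length=10)
    print(ids)
    print(mask)
```
</preformat>
      <p>In this toy version every word must already be in the vocabulary; a real subword
tokenizer avoids that restriction by splitting unseen words into known pieces.</p>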
      <p>To determine the sentiments expressed in the Code-Mixed YouTube comments/posts, the
Indic-BERT model [5] is used with fine-tuned parameters. Indic-BERT is a multilingual
representation model that extracts context from input representations in different languages in
both directions, and it is applied here to capture the semantic and linguistic features of a
multilingual sentence. YouTube comments/posts may contain more than one sentence; Indic-BERT
can pack these multilingual input sentences into a single sequence for input representation.
The Indic-BERT input embedding combines the token embedding, segment embedding and
positional embedding. The pretrained model can be fine-tuned to suit the downstream task by
adding a classification layer on top of the model. Indic-BERT can therefore be used for this
task of identifying how sentiment is expressed in Code-Mixed scenarios on social media.</p>
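      <p>A classification layer of the kind mentioned above is commonly a single linear layer
followed by a softmax over the five labels. The sketch below assumes that form; it is not taken
from the authors' implementation, and the hidden size, random weights and the name classify
are hypothetical stand-ins for the pooled encoder output and the trained parameters.</p>
      <preformat>
```python
import math
import random

LABELS = ["Positive", "Negative", "Mixed_feelings", "not-Tamil", "unknown_state"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify(pooled, weights, bias):
    """Apply a linear layer plus softmax to one pooled sentence vector."""
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    probabilities = softmax(logits)
    best = max(range(len(LABELS)), key=lambda index: probabilities[index])
    return LABELS[best], probabilities


if __name__ == "__main__":
    rng = random.Random(0)
    hidden_size = 8  # hypothetical; the real encoder output is much wider
    pooled = [rng.uniform(-1, 1) for _ in range(hidden_size)]
    # Untrained random weights, one row per label.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)] for _ in LABELS]
    bias = [0.0] * len(LABELS)
    label, probabilities = classify(pooled, weights, bias)
    print(label, [round(p, 3) for p in probabilities])
```
</preformat>
      <p>During fine-tuning, the weights of such a layer would be learned jointly with the encoder
parameters rather than drawn at random as in this illustration.</p>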
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussion</title>
      <p>To study the performance of the system on sentiment polarity identification, we conducted
an experiment using the Indic-BERT approach on the Code-Mixed data set. The experiment was
run on a workstation with an Intel Xeon quad-core processor, 32 GB RAM and an NVIDIA Quadro
P4000 GPU with 8 GB of memory. To improve the performance of the model, we tuned the
hyperparameters and settled on a learning rate of 3e-5, a batch size of 64 and 5 epochs.
Figure 1 shows the accuracy on the training and validation data; the accuracy reached its
maximum at the 5th epoch. Figure 2 plots the loss on the training and validation data. We
obtained a training accuracy of 69.91% with a loss of 0.6939, and a validation accuracy of
63.29% with a loss of 0.9681. The classifier is able to classify all the categories even though
the data set is not balanced; even the not-Tamil category, which has very few samples, is
classified correctly. As shown in Table 3, our proposed model achieves a weighted average F1
score of 61.73%.</p>
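      <p>The weighted average F1 score reported here weights each class's F1 score by the
number of true samples of that class. A minimal sketch of this computation is given below; the
labels in the example are made up for illustration and are not the shared-task data, and the
function names are hypothetical.</p>
      <preformat>
```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 score of one label, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores weighted by class support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        count / total * per_class_f1(y_true, y_pred, label)
        for label, count in support.items()
    )


if __name__ == "__main__":
    # Made-up labels for illustration; these are not the shared-task data.
    gold = ["Po", "Po", "Po", "Ne", "Ne", "Mf"]
    pred = ["Po", "Po", "Ne", "Ne", "Po", "Mf"]
    print(round(weighted_f1(gold, pred), 4))
```
</preformat>
      <p>Because the weights follow the class support, a dominant class such as Positive has the
largest influence on this metric when the data set is skewed.</p>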
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Social media content has increased in recent days. The goal of Dravidian-CodeMixFIRE 2021
is to identify the subjective opinions or emotional responses in social media comments. In this
work, we have presented the challenges involved in extracting the key terms to identify the
opinion in Code-Mixed tweets. A detailed experimental study was performed using different
architectures, and we found that the sentiment in social media content is better captured
using the Indic-BERT language model. We obtained a weighted F1 score of 61.73% with the
proposed model. We observed that the data set is skewed and that the lack of enough samples
for every category affected the performance of the classifier. In future work, we plan to
address the class imbalance problem for Code-Mixed sentiment analysis.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The authors would like to thank the management of Vellore Institute of Technology, Chennai
for providing the support to carry out this work. We would also like to thank the Science and
Engineering Research Board (SERB), Government of India, for their financial grant
(Award Number: ECR/2016/00484) for this research work.</p>
      <p>[4] H. Mubarak, K. Darwish, W. Magdy, Abusive language detection on Arabic social
media, in: Proceedings of the First Workshop on Abusive Language Online,
Association for Computational Linguistics, Vancouver, BC, Canada, 2017, pp. 52–56. URL:
https://aclanthology.org/W17-3008.
[5] D. Kakwani, A. Kunchukuttan, S. Golla, G. N.C., A. Bhattacharyya, M. M. Khapra, P. Kumar,
IndicNLPSuite: Monolingual corpora, evaluation benchmarks and pre-trained multilingual
language models for Indian languages, in: Findings of the Association for Computational
Linguistics: EMNLP 2020, Association for Computational Linguistics, Online, 2020, pp.
4948–4961. URL: https://aclanthology.org/2020.findings-emnlp.445.
[6] S. Thavareesan, S. Mahesan, Sentiment lexicon expansion using word2vec and fasttext for
sentiment prediction in tamil texts, 2020. doi:10.1109/MERCon50084.2020.9185369.
[7] S. Thavareesan, S. Mahesan, Sentiment analysis in tamil texts: A study on machine
learning techniques and feature representation, 2019 14th Conference on Industrial and
Information Systems (ICIIS) (2019) 320–325.
[8] S. Sivakumar, R. Rajalakshmi, Comparative evaluation of various feature weighting
methods on movie reviews, in: H. S. Behera, J. Nayak, B. Naik, A. Abraham (Eds.),
Computational Intelligence in Data Mining, Springer Singapore, Singapore, 2019, pp.
721–730.
[9] S. Soubraylu, R. Rajalakshmi, Hybrid convolutional bidirectional
recurrent neural network based sentiment analysis on movie reviews,
Computational Intelligence 37 (2021) 735–757. URL:
https://onlinelibrary.wiley.com/doi/abs/10.1111/coin.12400. doi:10.1111/coin.12400.
[10] A. Samuels, J. Mcgonical, Sentiment analysis on social media content, CoRR abs/2007.02144
(2020). URL: https://arxiv.org/abs/2007.02144. arXiv:2007.02144.
[11] R. R., Supervised term weighting methods for url classification, Journal of Computer
Science 10 (2014). doi:10.3844/jcssp.2014.1969.1976.
[12] R. R., C. Aravindan, An effective and discriminative feature learning for url based web page
classification, in: 2018 IEEE International Conference on Systems, Man, and Cybernetics
(SMC), 2018, pp. 1374–1379. doi:10.1109/SMC.2018.00240.
[13] R. R. Kannan, R. Rajalakshmi, Dlrg@aila 2019: Context - aware legal assistance system, in:
P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum
for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517
of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 58–63. URL:
http://ceur-ws.org/Vol-2517/T1-10.pdf.
[14] R. R., H. Tiwari, J. Patel, R. R., K. Ramamurthy, Bidirectional GRU-Based Attention Model
for Kid-Specific URL Classification, 2020, pp. 78–90. doi:10.4018/978-1-7998-1192-3.ch005.
[15] R. Rajalakshmi, H. Tiwari, J. Patel, A. Kumar, R. Karthik, Design of kids-specific url
classifier using recurrent convolutional neural network, Procedia Computer Science 167 (2020)
2124–2131. URL: https://www.sciencedirect.com/science/article/pii/S1877050920307262.
doi:10.1016/j.procs.2020.03.260. International Conference on Computational Intelligence
and Data Science.
[16] S. Soubraylu, R. Rajalakshmi, Analysis of sentiment on movie reviews using word
embedding self-attentive lstm, International Journal of Ambient Computing and Intelligence 12
(2021) 33–52. doi:10.4018/IJACI.2021040103.
[17] V. Ganganwar, R. Rajalakshmi, Implicit aspect extraction for sentiment analysis: A survey
of recent approaches, Procedia Computer Science 165 (2019) 485–491.
[18] P. Mishra, P. Danda, P. Dhakras, Code-mixed sentiment analysis using machine learning
and neural network approaches, CoRR abs/1808.03299 (2018). URL:
http://arxiv.org/abs/1808.03299. arXiv:1808.03299.
[19] R. Rajalakshmi, B. Y. Reddy, Dlrg@hasoc 2019: An enhanced ensemble classifier for hate
and offensive content identification, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.),
Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India,
December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019,
pp. 370–379. URL: http://ceur-ws.org/Vol-2517/T3-26.pdf.
[20] R. Rajalakshmi, Y. Reddy, L. Kumar, DLRG@DravidianLangTech-EACL2021:
Transformer based approach for offensive language identification on code-mixed Tamil, in:
Proceedings of the First Workshop on Speech and Language Technologies for
Dravidian Languages, Association for Computational Linguistics, Kyiv, 2021, pp. 357–362. URL:
https://aclanthology.org/2021.dravidianlangtech-1.53.
[21] R. Rajalakshmi, S. Srivarshan, M. L. P. R. Faerie, M. Faerie, K. E, S. Prithvi, K. M. Anand,
Conversational hate-offensive detection in code-mixed hindi-english tweets, Association
for Computing Machinery, 2021.
[22] R. Rajalakshmi, L. P. Reddy, M. Faerie, S. Srivarshan, K. M. Anand, Hate speech and
offensive content identification in hindi and marathi languages using ensemble techniques,
Association for Computing Machinery, 2021.
[23] B. R. Chakravarthi, M. Arcan, J. P. McCrae, Improving wordnets for under-resourced
languages using machine translation, in: Proceedings of the 9th Global Wordnet Conference,
Global Wordnet Association, Nanyang Technological University (NTU), Singapore, 2018,
pp. 77–86. URL: https://aclanthology.org/2018.gwc-1.10.
[24] WordNet Gloss Translation for Under-resourced Languages using Multilingual Neural
Machine Translation, Zenodo, 2019. URL: https://doi.org/10.18653/v1/w19-7101.
[25] B. R. Chakravarthi, R. Priyadharshini, V. Muralidaran, S. Suryawanshi, N. Jose, E. Sherly,
J. P. McCrae, Overview of the track on sentiment analysis for Dravidian languages in
code-mixed text, in: Forum for Information Retrieval Evaluation, FIRE 2020, Association
for Computing Machinery, New York, NY, USA, 2020, pp. 21–24. URL:
https://doi.org/10.1145/3441501.3441515.
[26] R. Rajalakshmi, R. Agrawal, Borrowing likeliness ranking based on relevance factor,
in: Proceedings of the Fourth ACM IKDD Conferences on Data Sciences, CODS ’17,
Association for Computing Machinery, New York, NY, USA, 2017. URL:
https://doi.org/10.1145/3041823.3067694. doi:10.1145/3041823.3067694.
[27] B. R. Chakravarthi, V. Muralidaran, R. Priyadharshini, J. P. McCrae, Corpus creation for
sentiment analysis in code-mixed Tamil-English text, in: Proceedings of the 1st Joint
Workshop on Spoken Language Technologies for Under-resourced languages (SLTU)
and Collaboration and Computing for Under-Resourced Languages (CCURL), European
Language Resources Association, Marseille, France, 2020, pp. 202–210. URL:
https://www.aclweb.org/anthology/2020.sltu-1.28.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chinnappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vasantharajan</surname>
          </string-name>
          ,
          <source>Findings of the Sentiment Analysis of Dravidian Languages in Code-Mixed Text</source>
          <year>2021</year>
          , in: Working Notes of FIRE 2021 -
          <article-title>Forum for Information Retrieval Evaluation</article-title>
          ,
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sakuntharaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Madasamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          , P. B,
          <string-name>
            <given-names>S. Chinnaudayar</given-names>
            <surname>Navaneethakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC-Dravidian CodeMix Shared Task on Offensive Language Detection in Tamil and Malayalam</article-title>
          , in: Working Notes of FIRE 2021 -
          <article-title>Forum for Information Retrieval Evaluation</article-title>
          ,
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chinnappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          , E. Sherly,
          <article-title>Overview of the Dravidiancodemix 2021 shared task on sentiment detection in tamil, malayalam, and kannada, in: Forum for Information Retrieval Evaluation</article-title>
          ,
          <string-name>
            <surname>FIRE</surname>
          </string-name>
          <year>2021</year>
          ,
          <article-title>Association for Computing Machinery</article-title>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>