<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ECMAG - Ensemble of CNN and Multi-Head Attention with Bi-GRU for Sentiment Analysis in Code-Mixed Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dhanasekaran Prasannakumaran</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jappeswaran Balasubramanian Sideshwar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Durairaj Thenmozhi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, SSN College of Engineering</institution>
          ,
          <addr-line>Chennai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>People spend a considerable amount of time on social media platforms consuming information. They share their views and opinions about the subjects they consume. These responses could be shared as posts on Facebook and Twitter or through comments on YouTube, and the polarity of these posts could be positive, negative or neutral. The posts or comments on social media are largely present in the Romanized English format of multiple languages, commonly referred to as code-mixed text. In this work, the authors propose an ensemble framework - Ensemble of Convolutional Neural Network and Multi-Head Attention with Bidirectional GRU (ECMAG) - to map code-mixed user comments to their corresponding sentiments. The performance of the framework is tested on the Tamil-English code-mixed dataset provided in the Dravidian CodeMix FIRE 2021 - Sentiment Analysis for Dravidian Languages in Code-Mixed Text task. The authors use the pre-trained XLM-R model to generate the sub-word embeddings. ECMAG consists of two components - Convolutional Neural Network for Texts (CNNT) and Multi-Head Attention pipelined to Bi-GRU (MHGRU). The proposed architecture achieved an F1-score of 0.411.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Code-Mixed text</kwd>
        <kwd>Transformers</kwd>
        <kwd>NLP</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The onset of digitization has made social media a major platform for expressing one’s
thoughts. Social media platforms like YouTube, Twitter, Facebook and Instagram are used by over
4.4 billion users every day. The amount of information available and accessible is increasing
exponentially by the day. Users engage, express and exchange opinions on subjects that
interest them. Sentiment analysis aims to identify the polarity of the user’s opinion.</p>
      <p>With about 122 million daily active users consuming more than a billion hours
of video content every day, YouTube is one of the most widely used social media platforms in the
world. Users post their views on a video they watched in the comment section. These comments
come from a diverse group of people and hence are written in multiple languages. People prefer to
use the Romanized form of their regional languages on social media, which
helps them express their opinions easily. This results in mixing the vocabulary and syntax of
multiple languages in the same sentence, which is known as code-mixed text.</p>
      <p>
        Research studies have been carried out to identify sentiments from monolingual text [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Recently, the task of sentiment analysis has been extended to code-mixed data and has attracted
the research fraternity. In this work, the authors aim to classify the sentiments of YouTube
comments in the Tamil-English code-mixed dataset, which is part of the ‘Dravidian-CodeMix
- FIRE 2021: Sentiment Analysis for Dravidian Languages in Code-Mixed Text’ task [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The
dataset provided consists of code-mixed YouTube comments in Dravidian languages – a family of
languages (Tamil, Telugu, Malayalam and Kannada) spoken by 220 million people, predominantly
in Southern India and Sri Lanka. The vocabulary of these languages is mixed with English
to produce code-mixed text. In this work, the authors propose an ensemble architecture
that uses a convolutional neural network and an attention mechanism pipelined
to a bidirectional gated recurrent unit layer to classify the comments into one of the given
sentiments.
      </p>
      <p>The rest of this work is organized as follows. Section 2 elaborates on the prominent works in
sentiment analysis of code-mixed data. The details of the dataset used in this work are given
in Section 3. The data preprocessing pipeline is presented in Section 4. Section 5 depicts the
architecture and elucidates its components. The results of the work are illustrated in Section 6.
Finally, the authors conclude and discuss the future scope of this work in Section 7.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Various approaches using Machine Learning (ML) and Deep Learning (DL) have been proposed
to solve the task of Sentiment Analysis (SA). Mohammad et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] adopted an ML approach
to detect the sentiments of tweets and messages with surface-form, semantic, and sentiment
features using an SVM classifier. Giatsoglou et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] proposed a polarity classification model
that used a hybrid feature vectorization process incorporating lexicon-based features and word
embedding based approaches. They employed an SVM classifier with a linear kernel for the
classification task.
      </p>
      <p>
        Unlike for monolingual text, designing accurate SA models for multilingual code-mixed text
is extremely challenging. Vyas et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] explored different approaches for POS tagging of
code-mixed data obtained from Facebook and Twitter. Sharma et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] leveraged various
lexicon based approaches for normalization of Hindi-English code-mixed text. A deep learning
approach was adopted by Joshi et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which uses a LSTM to learn sub-word representations
to extract the sentiment value of morpheme-like structures. Choudhary et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposed a
Siamese Network architecture comprising twin Bidirectional LSTM networks that projects the
sentences of code-mixed and standard languages to a common sentiment space. Lal et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
proposed a hybrid approach that combines dual encoder RNNs utilizing attention mechanisms,
with surface features, yielding a unified representation of code-mixed data for SA. Additionally,
there has been active research in offensive language identification and hate speech detection
on code-mixed social media data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        Yadav et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] proposed a zero-shot learning approach that uses cross-lingual and
multilingual embeddings which achieved state-of-the-art scores in Spanish-English code-mixed
SA. XLM, a state-of-the-art cross-lingual model that learns cross-lingual representations in
an unsupervised fashion, was proposed by Lample and Conneau [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. To further improve the
performance of XLM, Conneau et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] scaled the size of the model and the data required
for pretraining. This resulted in the cross-lingual language model XLM-RoBERTa, a Transformer-based
masked language model trained on one hundred languages, which significantly
outperformed Multilingual BERT (mBERT) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and the previous XLM models on a variety of
cross-lingual benchmarks. The authors of this paper use the pretrained XLM-RoBERTa model
to generate sub-word embeddings for the cross-lingual (Tamil-English) code-mixed data.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>
        For this work, the authors used the data available in the Dravidian-CodeMix FIRE 2021 [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ]
database. The data was obtained by crawling YouTube comments. The database contains three
different datasets – Tamil-English (Tanglish), Malayalam-English (Manglish) and
Kannada-English (Kanglish). Each dataset consists of 3 types of code-mixed sentences –
Inter-Sentential switch, Intra-Sentential switch and Tag switching. The comments are mapped to 5
different labels: Positive, Negative, Mixed Feeling, Unknown state and Unintended language.
The authors of this work aim to predict the sentiments of Tamil-English code-mixed text. The
summary of the dataset is illustrated in Table 1.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Data Preprocessing</title>
      <p>The code-mixed data provided is extremely noisy. It contains repeated words, emojis,
unaccounted words (i.e. words not available in the English dictionary), hashtags, user mentions
and obscene words. To handle the inconsistency, the authors propose an extensive data
cleaning/preprocessing pipeline to process the raw text.</p>
      <p>
        The authors use Ekphrasis [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], a collection of lightweight text tools primarily built for
processing text data from social media platforms like Twitter and Facebook. This tool is used for
word normalization, word segmentation (for splitting hashtags) and spelling correction. Numbers,
hashtags, all caps, extended, repeated and censored words are annotated appropriately.
      </p>
      <p>
        The text is processed serially and the steps involved in preprocessing are illustrated in Figure
1. First, the sentence is tokenized and the English characters are converted to lower case. The
emoji library [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is used to convert the pictogram (emoji) to words that describe the emotion.
Next, the word is checked for its presence in the English dictionary. If found, the word is
processed using the Ekphrasis [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] tool. Otherwise, the text is either in
code-mixed form or in a foreign language. In that case, the word is transliterated to its corresponding
Dravidian script (Tamil) using the Google transliteration tool [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The
sentences that correspond to the unintended language category are not processed in the proposed
pipeline.
      </p>
      <p>Hence, refined text is obtained containing only English words, only Tamil words, or both. This
pipeline therefore mitigates the noise present in code-mixed data. Figure 4 illustrates the text
before and after preprocessing.</p>
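      <p>As a rough sketch, the per-word decision logic of the pipeline described above can be expressed as follows. The dictionary, emoji table and helper functions are simplified hypothetical stand-ins for the actual Ekphrasis, emoji and Google transliteration tools, not their real APIs:</p>
      <preformat>
```python
# Hedged sketch of the preprocessing decision logic; the helpers below
# are toy stand-ins for Ekphrasis, the emoji library and transliteration.

ENGLISH_DICTIONARY = {"movie", "super", "acting", "good"}  # toy dictionary
EMOJI_MAP = {"\U0001F600": "grinning_face"}                # toy emoji table

def demojize(token):
    """Stand-in for emoji-to-word conversion (pictogram to description)."""
    return EMOJI_MAP.get(token, token)

def normalize_english(token):
    """Stand-in for Ekphrasis word normalization / spelling correction."""
    return token

def transliterate_to_tamil(token):
    """Stand-in for the Google transliteration tool (identity here)."""
    return token

def preprocess(sentence):
    cleaned = []
    for token in sentence.lower().split():       # tokenize + lower-case
        token = demojize(token).strip("!?.,")    # emoji to word, drop punctuation
        if token in ENGLISH_DICTIONARY:
            cleaned.append(normalize_english(token))      # English word path
        else:
            cleaned.append(transliterate_to_tamil(token)) # code-mixed path
    return " ".join(cleaned)
```
      </preformat>
      <p>For example, preprocess("Super movie \U0001F600") yields "super movie grinning_face" under these toy tables.</p>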
    </sec>
    <sec id="sec-5">
      <title>5. Architecture</title>
      <p>The processed text comprises other languages’ and/or English script. To obtain the word
embeddings of multilingual text, the authors used the XLM-RoBERTa (XLM-R) model. XLM-R
is a transformer-based masked language model trained on one hundred languages. In this work,
xlm-roberta-base model was used. The pre-processed text is tokenized into sub-words using
the XLM-R vocabulary. The IDs of these sub-words are then fed to an XLM-R encoder module to
obtain the sub-word embeddings which are used as inputs for the proposed architecture.</p>
      <p>The authors propose an ensemble framework ECMAG (illustrated in Fig 3) which consists of
two components – Convolutional Neural Network for Texts (CNNT) and Multi-Head Attention
pipelined to Bi-GRU (MHGRU). The details of the components are elucidated in the following
sections.</p>
      <sec id="sec-5-1">
        <title>5.1. Convolutional Neural Network for Texts (CNNT)</title>
        <p>
          The first component is a Convolutional Neural Network (CNN). Several studies [
          <xref ref-type="bibr" rid="ref20 ref21 ref22">20, 21, 22</xref>
          ]
have considered using CNNs for text classification. A CNN was used since it takes into account the
ordering of the words and the context in which each word occurs. The sub-word embeddings
from XLM-R are passed through 2D CNNs. In this work, the authors considered using 5 filters
of different sizes (3, 4, 5, 7, 9). The outputs from the individual 2D CNNs are passed through a
max pooling layer. Finally, the outputs from the pooling layers are concatenated and passed
through a fully connected layer, from which the output prediction of this component is
obtained.
        </p>
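        <p>The steps above can be sketched as a Kim-style CNN for text. This is an illustrative reconstruction assuming 768-dimensional XLM-R sub-word embeddings and the filter settings reported in Section 6, not the authors’ exact code:</p>
        <preformat>
```python
# Hedged sketch of the CNNT component; layer sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNT(nn.Module):
    def __init__(self, embed_dim=768, n_filters=100,
                 filter_sizes=(3, 4, 5, 7, 9), n_classes=5):
        super().__init__()
        # one 2D convolution per filter size, sliding over the sequence
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, n_filters, (fs, embed_dim)) for fs in filter_sizes]
        )
        self.fc = nn.Linear(n_filters * len(filter_sizes), n_classes)

    def forward(self, x):             # x: (batch, seq_len, embed_dim)
        x = x.unsqueeze(1)            # add a channel dimension
        # convolve, squeeze the collapsed embedding axis, apply ReLU
        feats = [F.relu(conv(x)).squeeze(3) for conv in self.convs]
        # max-pool each feature map over the remaining sequence axis
        pooled = [F.max_pool1d(f, f.size(2)).squeeze(2) for f in feats]
        # concatenate pooled features and project to class scores
        return self.fc(torch.cat(pooled, dim=1))   # (batch, n_classes)
```
        </preformat>
        <p>Each filter size captures n-gram-like context of a different width; max pooling keeps the strongest activation per filter regardless of where it occurs in the comment.</p>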
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Multi-Head Bi-GRU (MHGRU)</title>
        <p>
          Attention mechanism can be described as the weighted average of (sequence) elements with
weights dynamically computed based on an input query and element’s key. Query (Q)
corresponds to the sequence for which attention is paid. Key (K) is the vector used to identify the
elements that require more attention based on Q. The attention weights are averaged to obtain
the value vector (V). A score function (1) is used to determine the elements which require more
attention. The score function takes Q and K as input and outputs the attention weight of the
query-key pair. In this work the authors consider using the scaled dot product proposed by
Vaswani et al. [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ].
        </p>
        <p>(, ,  ) =  ( √ )
(1)</p>
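        <p>Equation (1) can be written out directly; a minimal sketch assuming batched tensors of key dimension d_k:</p>
        <preformat>
```python
# Hedged sketch of scaled dot-product attention (Equation (1)),
# following Vaswani et al.; shapes are illustrative.
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """q, k, v: (batch, seq_len, d_k). Returns the attended values."""
    d_k = q.size(-1)
    # similarity of every query with every key, scaled by sqrt(d_k)
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)   # each row sums to 1
    return torch.matmul(weights, v)           # weighted average of values
```
        </preformat>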
        <p>
          The scaled dot-product attention allows the deep learning network to attend over a sequence.
However, there are often multiple different aspects to a sequence, and these characteristics
cannot be captured by a single weighted average vector. Therefore, the authors employed
Multi-Head Attention (MHA) [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] which uses multiple different query-key-value triplets (heads)
on the same features. Self-attention (used in this work), first introduced by Luong et al. [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], is
an attention mechanism relating different positions of a single sequence in order to compute a
representation of the same sequence. Since self-attention is used, Q, K and V are initialized
with the same sentence (sequence), and the corresponding matrices are transformed into
sub-queries, sub-keys and sub-values (one per head), which are then passed through the scaled
dot-product attention (Equation (1)) independently. The attention outputs from each head are
then combined and the final weight matrix is calculated.
        </p>
        <p>The output from the MHA layer is then pipelined through a bidirectional GRU layer. The
output from the Bi-GRU layer is then passed through a fully connected layer and finally through
a Softmax layer to generate the predictions. Thus, the output prediction of this
component is obtained.</p>
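        <p>A minimal sketch of the MHGRU component as described above, assuming PyTorch building blocks. The GRU hidden size of 32 and embedding dimension of 768 follow Section 6, but the number of attention heads is not reported in the paper and is assumed here:</p>
        <preformat>
```python
# Hedged sketch of MHGRU: multi-head self-attention pipelined into a
# bidirectional GRU; the head count (8) is an assumption.
import torch
import torch.nn as nn

class MHGRU(nn.Module):
    def __init__(self, embed_dim=768, n_heads=8, hidden=32, n_classes=5):
        super().__init__()
        self.mha = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
        self.gru = nn.GRU(embed_dim, hidden, batch_first=True,
                          bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)  # 2x for both directions

    def forward(self, x):                 # x: (batch, seq_len, embed_dim)
        # self-attention: query, key and value are the same sequence
        attended, _ = self.mha(x, x, x)
        _, h = self.gru(attended)           # h: (2, batch, hidden)
        h = torch.cat([h[0], h[1]], dim=1)  # concat final fwd/bwd states
        return torch.softmax(self.fc(h), dim=1)  # class probabilities
```
        </preformat>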
        <p>(, ,  ) = (ℎ1, . . . , ℎ) 
where ℎ =  (,  ,   ),
 ,   ,   are the weight matrices of Q, K and V respectively</p>
        <p>The output predictions from each of the components are concatenated and passed through a
fully connected layer to obtain the final prediction, as illustrated in Equation (3).</p>
        <p>F : Δ(y_CNNT ⊕ y_MHGRU) → ŷ (3)</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Results</title>
      <p>
        Experimental settings: The performance of ECMAG is evaluated based on weighted-averaged
Precision, weighted-averaged Recall and weighted-averaged F-score. The following are the
hyper-parameter settings used in ECMAG: maximum sequence length: 64, batch size: 128,
CNN output dimension: 5, dropout: 0.3, number of filters: 100, filter sizes: [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref7 ref9">3, 4, 5, 7, 9</xref>
        ], loss
function: cross-entropy loss, optimizer: Adam, word embedding dimension: 768, GRU hidden
size: 32.
      </p>
      <p>Table 2 illustrates the validation results obtained using ECMAG. To validate the importance of
the components proposed in the architecture, the results obtained from the individual components
are also listed in Table 2. The proposed model achieved the scores on the test data
illustrated in Table 3.</p>
      <p>As the proposed architecture uses word embeddings from a pre-trained XLM-RoBERTa model
without fine-tuning it to the dataset at hand, the reported scores are only close to the baseline
scores of the task. Fine-tuning ECMAG to the given code-mixed dataset would indeed help in
capturing the finer meanings and contexts of the sub-words in their embeddings, which in turn
would enhance the performance of the model.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>
In this work, the authors propose and successfully test an ensemble architecture, ECMAG, on
the Tamil-English code-mixed dataset to identify the sentiment expressed in YouTube comments.
The XLM-RoBERTa model was used to obtain the sub-word embeddings, which were used as inputs to
each of the components. ECMAG achieved the following scores on the test data: Precision: 0.382,
Recall: 0.449 and F1 score: 0.411. For future work, the authors aim to process the text further
to handle different dialects and slang in Dravidian languages. Fine-tuning the XLM-RoBERTa
pre-trained model for the task at hand is another prospective area of work to improve the
performance of the model. Additionally, the authors aim to tackle the inherent imbalance present
in the dataset between categories. The authors also suggest building an interpretable machine
learning model to provide insights into the basis on which the predictions (sentiments) were made.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Raja</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <article-title>Comparison of pretrained embeddings to identify hate speech in indian code-mixed text</article-title>
          ,
          <source>in: 2020 2nd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>25</lpage>
          . doi:10.1109/ICACCCN51052.2020.9362731.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chinnappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          , E. Sherly,
          <article-title>Overview of the dravidiancodemix 2021 shared task on sentiment detection in tamil, malayalam, and kannada</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation, FIRE 2021</source>
          , Association for Computing Machinery,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kiritchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Nrc-canada: Building the state-of-the-art in sentiment analysis of tweets</article-title>
          ,
          <source>CoRR abs/1308.6242</source> (<year>2013</year>). URL: http://arxiv.org/abs/1308.6242. arXiv:1308.6242.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Giatsoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vozalis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Diamantaras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vakali</surname>
          </string-name>
          , G. Sarigiannidis,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chatzisavvas</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis leveraging emotions and word embeddings</article-title>
          ,
          <source>Expert Syst. Appl</source>
          .
          <volume>69</volume>
          (
          <year>2017</year>
          )
          <fpage>214</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Vyas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Choudhury</surname>
          </string-name>
          ,
          <article-title>Pos tagging of english-hindi code-mixed social media content</article-title>
          ,
          <source>in: EMNLP</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Balabantaray</surname>
          </string-name>
          ,
          <article-title>Text normalization of code mix and sentiment analysis</article-title>
          ,
          <source>in: 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1468</fpage>
          -
          <lpage>1473</lpage>
          . doi:10.1109/ICACCI.2015.7275819.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Prabhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Towards sub-word level compositions for sentiment analysis of hindi-english code mixed text</article-title>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Choudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Bindlish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis of code-mixed languages leveraging resource rich languages</article-title>
          , <source>CoRR abs/1804.00806</source> (<year>2018</year>). URL: http://arxiv.org/abs/1804.00806. arXiv:1804.00806.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y. K.</given-names>
            <surname>Lal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          , P. Koehn,
          <article-title>De-mixing sentiment from code-mixed text</article-title>
          ,
          <source>in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop</source>
          , Association for Computational Linguistics, Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>371</fpage>
          -
          <lpage>377</lpage>
          . URL: https://aclanthology.org/P19-2052. doi:10.18653/v1/P19-2052.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Yasaswini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Puranik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          , IIITT@DravidianLangTech-EACL2021:
          <article-title>Transfer learning for offensive language detection in Dravidian languages</article-title>
          ,
          <source>in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, Association for Computational Linguistics</source>
          , Kyiv,
          <year>2021</year>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>194</lpage>
          . URL: https://aclanthology.org/2021.dravidianlangtech-1.25.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yadav</surname>
          </string-name>
          , T. Chakraborty,
          <article-title>Zero-shot sentiment analysis for code-mixed data</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>35</volume>
          (
          <year>2021</year>
          )
          <fpage>15941</fpage>
          -
          <lpage>15942</lpage>
          . URL: https://ojs.aaai.org/index.php/AAAI/article/view/17967.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <article-title>Cross-lingual language model pretraining</article-title>
          , <source>CoRR abs/1901.07291</source> (<year>2019</year>). URL: http://arxiv.org/abs/1901.07291. arXiv:1901.07291.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          , CoRR abs/1911.02116 (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>in: NAACL</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chinnappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vasantharajan</surname>
          </string-name>
          ,
          <article-title>Findings of the Sentiment Analysis of Dravidian Languages in Code-Mixed Text 2021</article-title>
          ,
          <source>in: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation</source>
          , CEUR,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Muralidaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Suryawanshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <article-title>DravidianCodeMix: Sentiment analysis and offensive language identification dataset for Dravidian languages in code-mixed text</article-title>
          ,
          <source>CoRR abs/2106.09460</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2106.09460. arXiv:2106.09460.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Baziotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pelekis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doulkeridis</surname>
          </string-name>
          ,
          <article-title>DataStories at SemEval-2017 Task 4: Deep LSTM with attention for message-level and topic-based sentiment analysis</article-title>
          ,
          <source>in: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)</source>
          ,
          Association for Computational Linguistics
          , Vancouver, Canada,
          <year>2017</year>
          , pp.
          <fpage>747</fpage>
          -
          <lpage>754</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kim</surname>
          </string-name>
          , Emoji,
          <year>2014</year>
          . URL: https://github.com/carpedm20/emoji/.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>G. NC</surname>
          </string-name>
          ,
          <article-title>Googletrans: Free and unlimited Google Translate API for Python. Translates totally free of charge</article-title>
          ,
          <year>2020</year>
          . URL: https://py-googletrans.readthedocs.io/en/latest/.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Recurrent convolutional neural networks for text classification</article-title>
          ,
          <source>in: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence</source>
          , AAAI'15, AAAI Press,
          <year>2015</year>
          , pp.
          <fpage>2267</fpage>
          -
          <lpage>2273</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>Combining knowledge with deep convolutional neural networks for short text classification</article-title>
          ,
          <source>in: Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI'17</source>
          , AAAI Press,
          <year>2017</year>
          , pp.
          <fpage>2915</fpage>
          -
          <lpage>2921</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Moriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shibata</surname>
          </string-name>
          ,
          <article-title>Transfer learning method for very deep cnn for text classification and methods for its evaluation</article-title>
          ,
          <source>in: 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC)</source>
          , volume
          <volume>02</volume>
          ,
          <year>2018</year>
          , pp.
          <fpage>153</fpage>
          -
          <lpage>158</lpage>
          . doi:10.1109/COMPSAC.2018.10220.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <year>2017</year>
          . arXiv:1706.03762.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Luong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>Effective approaches to attention-based neural machine translation</article-title>
          ,
          <year>2015</year>
          . arXiv:1508.04025.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>