<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Sentiment Analysis of Dravidian-CodeMix Language</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ankit Kumar Mishra</string-name>
          <email>ankitmishra.co.in@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sunil Saumya</string-name>
          <email>sunil.saumya@iiitdwd.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhinav Kumar</string-name>
          <email>abhinavanand05@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Information Technology Dharwad</institution>
          ,
          <addr-line>Karnataka</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Siksha 'O' Anusandhan Deemed to be University</institution>
          ,
          <addr-line>Bhubaneswar</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>1</volume>
      <fpage>3</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>The computational examination of people's opinions, attitudes, and emotions conveyed in written language is known as sentiment analysis or opinion mining. In recent years, it has become one of the most active research fields in natural language processing and text mining. Sentiment analysis of social media texts, which are predominantly code-mixed for Dravidian languages, is becoming more popular. Code-mixing is common in multilingual communities, and code-mixed texts are written in both native and non-native scripts. The current paper uses machine learning, deep learning, and parallel hybrid deep learning models to identify sentiments in Dravidian code-mixed social media text. The experiments were conducted using a dataset from the Dravidian-CodeMix-FIRE 2021 competition, which included YouTube comments in Tamil, Malayalam, and Kannada code-mixed languages.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment Analysis</kwd>
        <kwd>code-mixed</kwd>
        <kwd>deep learning</kwd>
        <kwd>machine learning</kwd>
        <kwd>Dravidian languages</kwd>
        <kwd>CEUR-WS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Human language is difficult to grasp. It is hard to teach a machine to recognize the different linguistic nuances, cultural variances, slang, and misspellings that appear in social media discussions. It is considerably more difficult to teach a machine to recognize how context affects tone. When it comes to interpreting the tone of a piece of writing, humans are fairly intuitive. Consider this sentence: “I love waiting for the doctor”. Most people would immediately recognize that the sentence is sarcastic. We understand that everyone must wait in line to see a doctor, but we all despise waiting. Using this contextual information, we can easily identify the sentiment as negative. However, a machine reading the same statement might recognize the word “love” and categorize it as positive without knowing the context.</p>
      <p>
        to hear other people’s perspectives. For example, if we want to buy a product from an online
platform or a physical store, we first gather information about it, such as reading peer buyer
reviews and viewing videos, to determine whether the product is right for us [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Due to
the massive amount of data available on the internet in various forms, manually identifying
sentiments is nearly impossible. As a result, a trained machine is required to read the text and
automatically return its sentiment [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ].
      </p>
      <p>
        The majority of sentiment analysis research has taken place in the last one or two decades
[
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref8 ref9">8, 9, 10, 11, 12, 13, 14, 15, 16</xref>
        ]. One likely cause is the shift of business from
physical to virtual platforms [17]. The growth of social media and e-commerce on these
platforms has driven the development of a variety of sentiment analysis solutions for distinct
settings [18]. Because of its importance to business and society as a whole, sentiment
analysis research has spread beyond computer science to the management and social
sciences.
      </p>
      <p>Recently, there has been a surge in interest in sentiment analysis of social media posts and
reviews, which are frequently written in code-mixed form [19, 20, 21]. Code-mixing is common in
multilingual populations. People in a multilingual country like India
generally use code-mixed discourse in both physical and online settings. Code-mixed texts
are frequently written in a single script and combine two languages (for example, Hindi and
English). Code-mixing is demonstrated in the following sentence: “Maine aaj tak itna swadist
bhojan nahi kiya.” It is a code-mixed Hindi-English sentence. Even though it is written in
Roman script, many individuals who do not speak Hindi will be unable to comprehend the
sentiment or meaning of this sentence. Identifying sentiments in multilingual settings is difficult even
for machines that have been trained for monolingual settings.</p>
      <p>The current study analyzes the sentiment of YouTube comments in code-mixed Dravidian
languages, namely “Tamil-English”, “Malayalam-English”, and “Kannada-English”. The dataset used in
this work is part of the “Dravidian-CodeMix-FIRE 2021” shared task. The current
paper develops various classifiers to perform message-level classification of each YouTube comment
into one of the following classes: “Positive”, “negative”, “not-tamil/malayalam/kannada”,
“unknown_state”, and “mixed-feelings”. The paper examines the robustness of several
conventional machine learning models (such as support vector machine, random forest, and
decision tree) and deep learning models in serial settings (such as convolutional neural network
(CNN), long short-term memory (LSTM), bidirectional long short-term memory (Bi-LSTM), and
gated recurrent unit (GRU) networks) and parallel settings (two parallel CNNs, two parallel
LSTMs, a parallel CNN and LSTM, and three parallel LSTMs) for the sentiment analysis task. In the
experiments, a parallel hybrid model comprising LSTM networks achieved the best
F1-score of 0.56.</p>
      <p>The paper is organized as follows: the proposed methodology and the
dataset statistics are described in Section 2. The results of the experiments are explained in
Section 3. Section 4 summarizes the most important findings.</p>
      <p>Train: Tamil 20070, Malayalam 6421, Kannada 2823</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>The current research proposes a multi-class classification model for sentiment analysis of
Dravidian code-mixed YouTube comments. Figure 1 depicts the overall flow of the proposed
method. The outputs of three parallel LSTM-based layers are concatenated to form the features extracted from
YouTube comments, as shown in Figure 1. The output layer receives the concatenated vector as
input for multi-class classification.</p>
      <sec id="sec-2-1">
        <title>2.1. Task and Data Description</title>
        <p>The task was to classify each YouTube comment into one of five classes: “Positive”,
“negative”, “not-tamil/malayalam/kannada”, “unknown_state”, and “mixed-feelings”.</p>
        <p>The competition datasets were made available in stages. Initially, training and development
data for each of the Tamil, Malayalam, and Kannada corpora were released separately, resulting in
six separate data files. Each file contained two fields: a text field and a category field. Most comments
in the Tamil, Malayalam, and Kannada corpora were one sentence long; only a few contained
more than one. Table 1 describes the training and
development sets for all three corpora. The present system was trained on 35656, 15888, and
6212 samples for Tamil, Malayalam, and Kannada, respectively, and validated on 3962, 1766,
and 691 samples, as given in Table 1. The organizers later
provided the test dataset, on which the final ranking of submitted models was determined.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Data Preprocessing</title>
        <p>Every language dataset (train, development, and test sets) was subjected to the same preprocessing
steps. First, all punctuation was removed from the texts and they were converted to lowercase.
Rows containing only one word with no apparent meaning, such as “Suppperrrrrrrrrrrrrrr”,
were eliminated. Sentences with two or fewer letters were removed because they had little
impact on the dataset. The cleaned text
was then tokenized and encoded into a sequence of token indexes. Finally, padding with a maximum
length of 100 was used to ensure that all texts had the same length.</p>
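        <p>The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the helper names (clean, keep, build_vocab, encode), the sample comments, the exact short-text rule, the use of index 0 for both padding and unknown tokens, and right-side padding are all assumptions. The removal of meaningless single-word rows is not shown.</p>
        <preformat>
```python
import string

MAX_LEN = 100  # padding length stated in the text


def clean(text):
    """Lowercase the text and strip ASCII punctuation."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).strip()


def keep(text):
    """Keep texts with more than two letters (assumed reading of the rule)."""
    return len(text.replace(" ", "")) > 2


def build_vocab(texts):
    """Map each token to an index starting at 1; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def encode(text, vocab, max_len=MAX_LEN):
    """Convert a text to token indexes, then truncate or right-pad with 0."""
    ids = [vocab.get(token, 0) for token in text.split()][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical sample comments, used only for illustration.
comments = ["Super movie!!!", "ok", "Padam nalla iruku, super."]
cleaned = [c for c in map(clean, comments) if keep(c)]
vocab = build_vocab(cleaned)
encoded = [encode(c, vocab) for c in cleaned]
```
        </preformat>
        <p>In this sketch the two-letter comment is dropped, and every remaining comment becomes a list of exactly 100 token indexes.</p>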
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Classification models</title>
        <p>For sentiment analysis of YouTube comments, the current work used different classification
models. This section covers the conventional classifiers, deep learning classifiers, and
hybrid classifiers that were utilized in the study.</p>
        <p>Figure 1: Architecture of the best hybrid model. The pre-processed text data is fed to three parallel branches: Word2Vec embedding (100 × 500) followed by LSTM (256), random embedding (100 × 500) followed by bidirectional LSTM (256), and character embedding (100 × 500) followed by LSTM (256). The branch outputs pass through a concatenation layer and a dense layer (64) to the output labels Positive, Negative, Mixed_feelings, Not_tamil/malayalam/kannada, and Unknown_state.</p>
        <sec id="sec-2-3-1">
          <title>2.3.1. Conventional Machine Learning Classifiers</title>
          <p>The Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF) are
three traditional machine learning models that were developed to categorize YouTube
comments. TF-IDF vectors constructed from the Tamil, Malayalam, and Kannada comments were
used as input to these classifiers. First, a whitespace tokenizer was used to tokenize
the comments. Next, the Porter stemmer was used to stem the tokenized text. Finally,
a TF-IDF vectorizer was used to vectorize the stemmed text. The TF-IDF-based machine
learning models had extremely long training times.</p>
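          <p>To make the TF-IDF feature concrete, the following pure-Python sketch computes the plain weighting tf × log(N / df) over whitespace tokens. It is an assumption-laden illustration rather than the vectorizer used in the paper: the function name tfidf and the sample documents are hypothetical, stemming is omitted, and library vectorizers typically add smoothing and normalization on top of this formula.</p>
          <preformat>
```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]  # whitespace tokenizer
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents, used only for illustration.
docs = ["super movie", "super song", "bad movie bad"]
vectors = tfidf(docs)
```
          </preformat>
          <p>Terms that occur in few documents, such as "bad" in the third sample, receive a higher weight than terms shared across documents. The resulting sparse vectors would then be passed to a conventional classifier.</p>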
        </sec>
        <sec id="sec-2-3-2">
          <title>2.3.2. Deep learning classifier</title>
          <p>In the category of deep networks, the current study used convolutional neural networks (CNN) and recurrent neural network
variants such as long short-term memory (LSTM), bidirectional long short-term memory
(BiLSTM), and gated recurrent units (GRU), in single and multiple layers.
The tokenized texts were first encoded and then padded to a maximum comment
length of 100, 60, and 50 for the Tamil, Malayalam, and Kannada datasets, respectively.</p>
          <p>The input to the above deep models was a one-hot vector that represented each word
with as many dimensions as the vocabulary of its language. For example, the vocabulary size of the Tamil corpus
was 55810, so each word in the Tamil data was represented by a vector with 55810
dimensions (1×55810). The one-hot input representation was a high-dimensional sparse vector
with a single 1 (at the token’s index) and zeros elsewhere. This high-dimensional sparse
vector was reduced to a low-dimensional dense vector before being passed as input to the
deep models. An embedding layer that represented every word as a 500-dimensional (1×500)
dense vector was used for this conversion.</p>
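          <p>The relationship between the one-hot input and the embedding layer can be illustrated with a toy example. The sketch below assumes a vocabulary of 6 tokens and 4 embedding dimensions instead of the 55810 and 500 used in the paper, and all names (one_hot, project, embedding) are hypothetical. It shows that multiplying a one-hot vector by the embedding matrix is the same as looking up one row of that matrix.</p>
          <preformat>
```python
import random

VOCAB_SIZE = 6  # toy value; the Tamil vocabulary in the text has 55810 entries
EMBED_DIM = 4   # toy value; the text uses 500

random.seed(0)
# Embedding matrix with one row per vocabulary index (random initialisation).
embedding = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)]
             for _ in range(VOCAB_SIZE)]


def one_hot(index, size=VOCAB_SIZE):
    """Sparse 1 x size vector with a single 1 at the token's index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def project(vector, matrix):
    """Multiply a 1 x V row vector by a V x D matrix, giving a 1 x D vector."""
    return [sum(v * row[d] for v, row in zip(vector, matrix))
            for d in range(len(matrix[0]))]


token_index = 3
dense_by_matmul = project(one_hot(token_index), embedding)
dense_by_lookup = embedding[token_index]
```
          </preformat>
          <p>Both results hold the same 4 values, which is why an embedding layer can be implemented as an index lookup without ever materializing the sparse one-hot vector.</p>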
          <p>To map the one-hot vectors to dense embedded vectors, both pre-trained weights
from Word2Vec and random embeddings were used. The resulting embedded vectors were fed
to the deep networks (such as single and multiple layers of CNN, LSTM, GRU, and Bi-LSTM)
for contextual feature extraction followed by classification.</p>
        </sec>
        <sec id="sec-2-3-3">
          <title>2.3.3. Hybrid Network</title>
          <p>Several hybrid deep neural network configurations, such as parallel CNN-CNN, CNN-LSTM,
CNN-BiLSTM, CNN-CNN-CNN, CNN-LSTM-BiLSTM, and CNN-CNN-GRU,
were also developed for the sentiment analysis task. Figure 1 shows the architecture
of the best hybrid model. The embedding vectors were used as the input to these networks, as
described in Section 2.3.2.</p>
          <p>For all of the datasets, the model shown in Figure 1 reported the best performance. The number of
words fed from the pre-processed text for Tamil, Malayalam, and Kannada was 100,
60, and 50, respectively. Every word in Tamil (with a dimension of 1×55810), Malayalam (1×34012),
and Kannada (1×13015), and every character in Tamil (1×358),
Malayalam (1×230), and Kannada (1×222),
was represented as a 500-dimensional embedded vector (1×500) using word2vec, random
embedding, and character embeddings. The parallel Bidirectional long short-term memory
(BiLSTM) models were fed the embedded vectors. The extracted features (or outputs) from the
three parallel layers were then concatenated into a single vector. The concatenated vector was
then fed to two serial dense layers with 64 and 5 neurons, respectively. The last dense layer
(with 5 neurons, shown as output labels in Figure 1) classifies each comment into
one of the five sentiment categories. For all datasets, the model was trained for 50 epochs with a
batch size of 64, the Adam optimizer, and the categorical cross-entropy loss function.</p>
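          <p>As a rough check on the size of such a network, the sketch below walks through the layer dimensions and counts trainable weights. It rests on several assumptions that the text does not confirm: the three branches are taken to be LSTM, bidirectional LSTM, and LSTM with 256 units each (the layout labelled in Figure 1), each branch is assumed to return only its final state, the two directions of the bidirectional branch are assumed to be concatenated, and embedding weights are not counted. The function names are hypothetical.</p>
          <preformat>
```python
EMBED_DIM = 500    # embedding size stated in the text
UNITS = 256        # units per recurrent layer, as labelled in Figure 1
DENSE_UNITS = 64   # first dense layer
NUM_CLASSES = 5    # output layer, one neuron per sentiment class


def lstm_params(input_dim, units):
    """Weights of one LSTM layer: 4 gates with input, recurrent and bias terms."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units


# Assumed branch layout: (embedding type, output size, recurrent weights).
branches = [
    ("word2vec", UNITS, lstm_params(EMBED_DIM, UNITS)),
    ("random", 2 * UNITS, 2 * lstm_params(EMBED_DIM, UNITS)),  # two directions
    ("character", UNITS, lstm_params(EMBED_DIM, UNITS)),
]

concat_dim = sum(out_dim for _, out_dim, _ in branches)
recurrent_params = sum(params for _, _, params in branches)
head_params = (dense_params(concat_dim, DENSE_UNITS)
               + dense_params(DENSE_UNITS, NUM_CLASSES))
```
          </preformat>
          <p>Under these assumptions the concatenated feature vector has 1024 dimensions, the recurrent layers hold 3100672 weights, and the two dense layers add 65925 more.</p>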
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>As discussed above, several classification models were built and evaluated on the given
Dravidian code-mixed datasets. We report the results of the models submitted to the
competition and of a few other comparable models. All models were built in Python using the libraries
Sklearn, Keras, Pandas, and Numpy. The metric used to evaluate the performance of a model
was the weighted F1-score (or weighted F1). Table 2 shows the experimental results of various
machine learning, deep learning, and hybrid learning models on the development dataset. For the
conventional classifiers, the performance of SVM, decision tree, and random forest is shown in
Table 2. The conventional classifiers used TF-IDF vectors as features, whereas the deep and
hybrid models used embeddings.
      </p>
      <p>Table 2: Results of conventional, deep learning, and hybrid learning models on the Tamil, Malayalam, and Kannada development sets.</p>
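      <p>The weighted F1 metric used for evaluation averages the per-class F1 scores, with each class weighted by its number of true instances. The following pure-Python sketch shows one way to compute it; the function name weighted_f1 and the sample labels are hypothetical and are not taken from the competition data.</p>
      <preformat>
```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)
    total = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        total += f1 * n_true
    return total / len(y_true)


# Hypothetical labels, used only for illustration.
y_true = ["Positive", "Positive", "Positive", "Negative", "Mixed_feelings"]
y_pred = ["Positive", "Positive", "Negative", "Negative", "Positive"]
score = weighted_f1(y_true, y_pred)
```
      </preformat>
      <p>Because each class contributes in proportion to its support, frequent classes dominate the score, which matters for datasets where one class is much larger than the others.</p>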
      <p>Among the conventional models, for the Tamil dataset, SVM reported a weighted F1 of 0.45, random forest
0.46, and decision tree 0.45. The
traditional classifiers performed poorly compared to the deep learning classifiers, as shown in Table 2.
The LSTM and Bi-LSTM models with word2vec embedding reported weighted F1
scores of 0.51 and 0.52, respectively, whereas the CNN model with word2vec embedding
performed much worse, with a weighted F1 of 0.42. The best result came from the hybrid
3-parallel Bi-LSTM model with word2vec, random word, and random character embeddings,
which reported a weighted F1 of 0.55. The 3-parallel LSTM and CNN Bi-LSTM models
each reported a weighted F1 of 0.53.</p>
      <p>For the Malayalam dataset, among the conventional models, random forest
(weighted F1 0.59) performed better than SVM (weighted F1 0.56) and decision tree (weighted F1 0.57).
The best performing model was the 3-parallel Bi-LSTM with random word, random character,
and word2vec embeddings, with a weighted F1 of 0.63. Similarly, for the Kannada dataset, among the
conventional models, random forest (weighted F1 0.50) performed better than
the decision tree (weighted F1 0.47) and SVM (weighted F1 0.46). The best performing model
was the 3-parallel Bi-LSTM with word2vec, random word, and random character embeddings, with a
weighted F1 of 0.56.</p>
      <p>The best models from the development data evaluation were then submitted to the
competition and evaluated by the organizers. The organizers
evaluated each model submitted by the participating teams for each task on the test data, which they provided
without labels. The final ranking of all submitted models was published based on
the submitted predictions. The weighted F1 score obtained on the test dataset by the corresponding model
for each task is shown in Table 3. As can be seen in Table 3, for the Tamil, Malayalam,
and Kannada datasets we secured the 15th, 12th, and 12th ranks, respectively, among all submitted models.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>The current paper identified the sentiment of Dravidian code-mixed YouTube comments written
in Tamil, Malayalam, and Kannada. Every YouTube comment was categorized into one of 5
categories: “Positive”, “negative”, “not-tamil/malayalam/kannada”, “unknown_state”, and
“mixed-feelings”. For all three datasets, the best performing model was the 3-parallel LSTM model
with Word2vec, random word, and random character embeddings. The model reported
weighted F1 scores of 0.55, 0.63, and 0.56 on the development data and obtained the 15th, 12th, and 12th
positions for the Tamil, Malayalam, and Kannada datasets, respectively.</p>
      <p>and Computing for Under-Resourced Languages (CCURL), European Language Resources
Association, Marseille, France, 2020, pp. 177–184. URL: https://aclanthology.org/2020.sltu-1.25.</p>
      <p>[15] B. R. Chakravarthi, V. Muralidaran, R. Priyadharshini, J. P. McCrae, Corpus
creation for sentiment analysis in code-mixed Tamil-English text, in: Proceedings of the
1st Joint Workshop on Spoken Language Technologies for Under-resourced languages
(SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL),
European Language Resources Association, Marseille, France, 2020, pp. 202–210. URL:
https://aclanthology.org/2020.sltu-1.28.</p>
      <p>[16] A. Hande, R. Priyadharshini, B. R. Chakravarthi, KanCMD: Kannada CodeMixed dataset
for sentiment analysis and offensive language detection, in: Proceedings of the Third
Workshop on Computational Modeling of People’s Opinions, Personality, and Emotion’s
in Social Media, Association for Computational Linguistics, Barcelona, Spain (Online),
2020, pp. 54–63. URL: https://aclanthology.org/2020.peoples-1.6.</p>
      <p>[17] R. Gatautis, The rise of the platforms: Business model innovation perspectives, Engineering
Economics 28 (2017) 585–591.</p>
      <p>[18] L. Yue, W. Chen, X. Li, W. Zuo, M. Yin, A survey of sentiment analysis in social media,
Knowledge and Information Systems 60 (2019) 617–663.</p>
      <p>[19] A. Joshi, A. Prabhu, M. Shrivastava, V. Varma, Towards sub-word level compositions for
sentiment analysis of Hindi-English code mixed text, in: Proceedings of COLING 2016, the
26th International Conference on Computational Linguistics: Technical Papers, 2016, pp.
2482–2491.</p>
      <p>[20] A. Kumar, S. Saumya, J. P. Singh, NITP-AI-NLP@Dravidian-CodeMix-FIRE2020: A Hybrid
CNN and Bi-LSTM Network for Sentiment Analysis of Dravidian Code-Mixed Social Media
Posts, in: FIRE (Working Notes), 2020, pp. 582–590.</p>
      <p>[21] B. R. Chakravarthi, N. Jose, S. Suryawanshi, E. Sherly, J. P. McCrae, A sentiment analysis
dataset for code-mixed Malayalam-English, arXiv preprint arXiv:2006.00210 (2020).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Detection of spam reviews: A sentiment analysis approach</article-title>
          ,
          <source>CSI Transactions on ICT</source>
          <volume>6</volume>
          (
          <year>2018</year>
          )
          <fpage>137</fpage>
          -
          <lpage>148</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. M. E.-D. M.</given-names>
            <surname>Hussein</surname>
          </string-name>
          ,
          <article-title>A survey on sentiment analysis challenges</article-title>
          ,
          <source>Journal of King Saud University-Engineering Sciences</source>
          <volume>30</volume>
          (
          <year>2018</year>
          )
          <fpage>330</fpage>
          -
          <lpage>338</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Online shopping behavior study based on multi-granularity opinion mining: China versus America</article-title>
          ,
          <source>Cognitive Computation</source>
          <volume>8</volume>
          (
          <year>2016</year>
          )
          <fpage>587</fpage>
          -
          <lpage>602</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. K.</given-names>
            <surname>Dwivedi</surname>
          </string-name>
          ,
          <article-title>Predicting the helpfulness score of online reviews using convolutional neural network</article-title>
          ,
          <source>Soft Computing</source>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          , https://doi.org/10.1007/s00500-019-03851-5.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Social interaction-based consumer decision-making model in social commerce: The role of word of mouth and observational learning</article-title>
          ,
          <source>International Journal of Information Management</source>
          <volume>37</volume>
          (
          <year>2017</year>
          )
          <fpage>179</fpage>
          -
          <lpage>189</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kenyon-Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fujimoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Georges-Filteau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Glasz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lalande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhanderi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Belfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kanagasabai</surname>
          </string-name>
          , et al.,
          <article-title>Sentiment analysis: It's complicated!</article-title>
          ,
          in:
          <source>Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1886</fpage>
          -
          <lpage>1895</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          , et al.,
          <article-title>Spam review detection using lstm autoencoder: an unsupervised approach</article-title>
          ,
          <source>Electronic Commerce Research</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          , https://doi.org/10.1007/s10660-020-09413-4.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ravi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ravi</surname>
          </string-name>
          ,
          <article-title>A survey on opinion mining and sentiment analysis: tasks, approaches and applications</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>89</volume>
          (
          <year>2015</year>
          )
          <fpage>14</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Muralidaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Suryawanshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <article-title>Overview of the track on sentiment analysis for dravidian languages in code-mixed text</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jayapal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <article-title>NUIG-Shubhanker@Dravidian-CodeMix-FIRE2020: Sentiment analysis of code-mixed Dravidian text using XLNet</article-title>
          , in:
          <source>FIRE</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Suryawanshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Findings of the shared task on troll meme classification in Tamil</article-title>
          ,
          <source>in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, Association for Computational Linguistics</source>
          , Kyiv,
          <year>2021</year>
          , pp.
          <fpage>126</fpage>
          -
          <lpage>132</lpage>
          . URL: https://aclanthology.org/2021.dravidianlangtech-1.16.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chinnappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <article-title>Overview of the DravidianCodeMix 2021 shared task on sentiment detection in Tamil, Malayalam, and Kannada</article-title>
          , in:
          <source>Forum for Information Retrieval Evaluation, FIRE 2021</source>
          , Association for Computing Machinery,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chinnappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vasantharajan</surname>
          </string-name>
          ,
          <article-title>Findings of the Sentiment Analysis of Dravidian Languages in Code-Mixed Text 2021</article-title>
          , in:
          <source>Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation</source>
          , CEUR,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Suryawanshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <article-title>A sentiment analysis dataset for code-mixed Malayalam-English</article-title>
          , in:
          <source>Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>