<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>J. Varsha); bharathib@ssn.edu.in (B. Bharathi); meenaksa@srmist.edu.in
(A. Meenakshi)
 https://www.ssn.edu.in/staf-members/dr-b-bharathi/ (B. Bharathi)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1080/10911359.2014.995392</article-id>
      <title-group>
        <article-title>Sentiment Analysis and Homophobia detection of YouTube comments in Code-Mixed Dravidian Languages using machine learning and Transformer models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Josephine Varsha</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>B Bharathi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Meenakshi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of CSE, Sri Siva Subramaniya Nadar College of Engineering</institution>
          ,
          <addr-line>Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science and Application, SRM Institute of Science and Technology</institution>
          ,
          <addr-line>Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Sentiment analysis is the task of identifying the emotions underlying subjective opinions or emotional responses to a given topic, whether positive, negative, or neutral, and it is carried out using natural language processing. Homophobic speech is a type of hate speech directed towards LGBT+ people. This work presents sentiment analysis and homophobia detection for YouTube comments in code-mixed Dravidian languages, using different embeddings with machine learning algorithms. The goal of Task A is to identify the sentiment polarity of a code-mixed dataset of comments and posts in Tamil-English, Malayalam-English, and Kannada-English collected from social media. The goal of Task B is to identify whether a comment is homophobic or transphobic in nature. Our team, srmnlp, worked with the code-mixed Tamil, Malayalam, and Kannada text provided by the FIRE 2022 organizers. Pre-trained models such as BERT, XLM, and MPNet were used along with classifiers such as SVM, MLP, and Random Forest, with Count Vectorizer and TF-IDF as feature extraction techniques. In the sentiment analysis task, our submissions ranked 1st on the Tamil dataset, 6th on the Malayalam dataset, and 7th on the Kannada dataset. The highest F1-score for sentiment analysis, 0.63, was obtained on the Malayalam dataset; similarly, an F1-score of 0.95 was obtained for the homophobia detection task on the Malayalam dataset. The performance of the proposed system is compared across various machine learning algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>Count Vectorizer</kwd>
        <kwd>TF-IDF</kwd>
        <kwd>Random Forest</kwd>
        <kwd>Adaboost</kwd>
        <kwd>Xlnet</kwd>
        <kwd>Sentiment</kwd>
        <kwd>Homophobia</kwd>
        <kwd>Transphobia</kwd>
        <kwd>LGBT+</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Social media can significantly change the reputation of a person or a group. It plays a
major role in online communication, allowing users to freely post and share material and to
express their opinions and thoughts on anything at any time. With freedom of speech prevailing
on social media, some users intentionally send hateful messages toward LGBT+ people. People
who identify as LGBT+ are routinely mistreated, treated unfairly, tortured, and even executed
around the world because of the way they appear, the people they love, and who they are. Even
with increasing awareness of the importance of media representation, the amount of general
toxicity on the internet remains unchanged (McInroy and Craig [1]).</p>
      <p>The usage of social media has increased dramatically in recent years, yet there are fundamental
standards of behaviour that limit free expression in order to preserve a positive environment
and prevent online abuse, as noted by Chakravarthi et al. [2]. By exploiting special features of
the Internet such as anonymity, a user can have a big impact on other people’s lives.
Unfortunately, homophobic or transphobic attacks also target LGBT+ individuals who seek
consolation online. As a result, LGBT+ individuals seeking support online are assaulted or
mistreated, which has a serious impact on their mental health (McConnell et al. [3]).</p>
      <p>Sentiment analysis is the process of determining the sentiments, such as emotions, expressed
in a given text, sentence, or paragraph, which may or may not affect others (Kalaivani and
Thenmozhi [4]). As a text mining task, sentiment analysis helps companies and researchers
extract personal information from source material in order to better understand the social
sentiment around a brand or service (Hande et al. [5]). There are many monolingual datasets for
Dravidian languages that can be used for various types of study to discover and extract the
emotions from a text.</p>
      <p>Sentiment analysis of multilingual code-mixed text is an important and challenging research
area. The organizers proposed the Dravidian-CodeMix-FIRE shared task to classify the sentiment
polarity of comments as positive, negative, neutral, mixed emotions, unknown state, or not in
the intended language for Tamil-English and Malayalam-English text. The aim of such sentiment
analysis is to detect the mixed feelings of online users and to help prevent unusual activities,
depression, and criminal activities.</p>
      <p>To make it simpler to ban anti-LGBT+ content and drive the Internet toward equality, diversity,
and inclusion, it is urgently necessary to find and filter homophobic and transphobic material
online. It is also crucial to evaluate and discern the sentiment of a text. Our team worked on the
shared task of Homophobia Detection and Sentiment Analysis of Dravidian-CodeMix-FIRE 2022.
For this work, we used the datasets of code-mixed Tamil, code-mixed Malayalam, and
code-mixed Kannada text for the sentiment analysis task, comprising comments from YouTube.
Similarly, we used three datasets (Tamil, Malayalam, and code-mixed Tamil text) for Homophobia
Detection, which also comprised comments from YouTube. We used the feature extraction
methods Count Vectorizer and TF-IDF to extract the feature vectors, classifier models such as
SVM, MLP, and AdaBoost, and transformer models such as BERT and XLM. In this study, we
examine the effectiveness of various learning models for homophobia detection and sentiment
analysis, as part of the Dravidian-CodeMix shared task reported in the working notes of FIRE 2022
(Shanmugavadivel et al. [6]).</p>
      <p>The paper has five sections. Section 2 reviews related work on sentiment analysis and
homophobia detection for Tamil and other languages. Section 3 details the proposed methodology
and the techniques used. Results and observations are discussed in Section 4. The paper is
concluded in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>To recognize the emotions in text sequences, Thenmozhi et al. [7] used a Seq2Seq deep neural
network and built various models by adjusting settings such as the number of layers, units, and
attention wrappers, with and without a delimiter string, and the train-validation split.
Experiments on a development set show that the two-layered LSTM with the Normed Bahdanau
attention mechanism, delimiter string, and train-validation split outperforms all alternatives.</p>
      <p>To enhance study in the under-resourced Dravidian languages, Chakravarthi et al. [8] created
an annotation scheme and obtained a high level of inter-annotator agreement, measured in terms of
Krippendorff’s alpha, from volunteer annotators on contributions gathered using Google Forms.
For each class, baselines with gold standard annotations were constructed and presented in terms
of recall, precision, and F-score.</p>
      <p>To classify the emotions in Tamil, Sampath et al. [9] produced two additional datasets with
fine-grained emotions. They plan to increase the dataset’s size and include more exact emotional
descriptors in order to boost the system’s efficacy.</p>
      <p>A method for performing sentiment analysis in Tamil texts using the k-means clustering and
k-nearest neighbor algorithm was proposed in Thavareesan and Mahesan [10]. For all k values,
class-wise clustering with m-folds of the training set beat the alternative strategies and the
baseline method. This method’s ability to produce improved accuracy with fewer k-nearest
neighbor classifier training samples is another enhancement.</p>
      <p>Chakravarthi et al. [11] conducted the first joint effort for classifying YouTube comments
using the Tamil, English, and Tamil-English (code-mixed) dataset to detect homophobia and
transphobia. To deal with data imbalance and multilingualism, the most effective solution used
XLM, RoBERTa pre-trained language models for zero-shot learning.</p>
      <p>The first dataset on homophobia and transphobia in multilingual comments in Tamil, English,
and Tamil-English was produced by Chakravarthi et al. [2]. This study offered a dataset with
high-quality, expert homophobic and transphobic content classification from multilingual
YouTube comments as well as a hierarchical granular homophobic and transphobic taxonomy.</p>
      <p>The authors of S et al. [12] made use of an Emotion Word Ontology, a synthesis of two
knowledge bases of words and emojis. The Word Knowledge Base contains a list of emotional terms
that are matched to the appropriate emotion class. Similarly, the Emoji Knowledge Base includes
emotion icons that correspond to the associated emotions.</p>
      <p>Anantharaman et al. [13] offered a model that initially learns to extract the sub-elements
(holders, targets, and expressions) using sequence labelers.</p>
      <p>The authors of [14], [15], and [16] use machine learning algorithms and transformer models
for sentiment analysis and homophobia detection tasks.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed approach</title>
      <p>In this section, we describe our implementation of feature extraction and machine
learning algorithms, and we evaluate the performance of the various algorithms employed
together with each feature extraction procedure. The architecture of the proposed model and
the steps involved are illustrated in Fig. 1 and Fig. 2.</p>
      <p>The datasets provided by the FIRE 2022 organizers for Sentiment Analysis and
Homophobia Detection [2] consisted of code-mixed text in Tamil, Malayalam, and Kannada, each
made up of YouTube comments. The details regarding the datasets are provided in Table 1 and
Table 2.</p>
      <sec id="sec-3-1">
        <title>3.1. Data-set Analysis</title>
        <p>The aim of the first task is to determine whether a particular comment expresses an emotion
and which sentiment it represents. This task involves polarity classification at the message level.
Systems must categorize a YouTube comment as positive, negative, neutral, or mixed emotions.</p>
        <p>The data issued by the FIRE 2022 organizers is split into three sets, namely a training set, a
development set, and a test set, containing 15889, 1767, and 1963 instances respectively
for the Malayalam code-mixed text, 35657, 3963, and 650 instances respectively for the Tamil
code-mixed text, and 6213, 692, and 769 instances respectively for the Kannada code-mixed text.
Each instance contains a text sequence that includes user utterances along with the context,
followed by the sentiment class label. The task was to label each instance as one of the following:
Positive, Negative, Mixed Feelings, Unknown State, Not in Language.</p>
        <p>The goal of the second task is to check whether a specific comment contains homophobic
or transphobic speech and, if not, to label the comment Non-LGBTQ+. We were provided with
comments extracted from social media platforms and developed systems to predict whether each
comment is homophobic or transphobic in nature. The seed data for this task is the
Homophobia/Transphobia Detection dataset, a collection of comments from YouTube. This
dataset consists of manually annotated comments indicating whether the text is
homophobic/transphobic or not.</p>
        <p>The dataset provided by the FIRE 2022 organizers consisted of a training set, development set,
and test set of 2663, 667, and 650 instances respectively for the Tamil text, 3115, 867, and 1214
instances respectively for the Malayalam text, and 3862, 967, and 1208 instances respectively
for the Tamil-English code-mixed text. Each instance contains a text sequence that includes user
utterances along with the context, followed by the homophobic or transphobic class label. The task
was to label each instance as one of the following: Homophobia, Transphobia, Non-LGBT
content.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Pre-processing</title>
        <p>Real-world data frequently contains noise and missing values, and it may be in a format
that cannot be used directly by machine learning models, so data pre-processing is crucial for
any machine learning challenge. Pre-processing cleans the data and prepares it for a machine
learning model, which also improves the model’s accuracy and effectiveness. Before
classification, the dataset must therefore first be cleaned and processed.</p>
        <p>Data pre-processing was implemented with the Natural Language Toolkit (NLTK), a package
created for natural language processing (NLP). It provides text-processing libraries for
classification, tokenization, parsing, semantic reasoning, and more. Functions were used to clean
the text by removing URLs, numerals, and tags.</p>
        <p>We extracted the tokens from each string using the RegexpTokenizer class from NLTK’s
tokenize.regexp module. Tokenizing is an essential step in cleaning the text. It divides the text
into words or sentences, breaking it into more manageable chunks while maintaining its
meaning.</p>
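        <p>The cleaning and tokenization steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact code used in our system: the regular expressions, the lowercasing step, the <monospace>\w+</monospace> token pattern, the helper names <monospace>clean_comment</monospace> and <monospace>preprocess</monospace>, and the sample comment are all hypothetical. The sketch uses NLTK’s RegexpTokenizer when NLTK is installed and otherwise falls back to the standard library with the same pattern.</p>
        <preformat><![CDATA[
```python
import re

try:
    # NLTK is the toolkit named in the text; RegexpTokenizer lives in nltk.tokenize.
    from nltk.tokenize import RegexpTokenizer
    _tokenize = RegexpTokenizer(r"\w+").tokenize
except ImportError:
    # Fallback with the same pattern so the sketch still runs without NLTK.
    _tokenize = re.compile(r"\w+").findall

# Assumed patterns for the three kinds of noise mentioned in the text.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
NUM_RE = re.compile(r"\d+")


def clean_comment(text):
    """Remove URLs, markup tags, and numerals, then lowercase (lowercasing is an assumption)."""
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = NUM_RE.sub(" ", text)
    return text.lower()


def preprocess(text):
    """Clean a raw comment and split it into word tokens."""
    return _tokenize(clean_comment(text))


comment = "Semma <b>movie</b> 100 times paakalam!! https://youtu.be/abc123"
print(preprocess(comment))  # ['semma', 'movie', 'times', 'paakalam']
```
]]></preformat>
        <p>The sample here is romanized code-mixed text. Comments written in native script may need a different token pattern, since the choice of <monospace>\w+</monospace> is only an assumption of this sketch.</p>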
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Methodology</title>
        <p>The datasets were used with machine learning models built on various embeddings, namely
TF-IDF, Count Vectorizer, and BERT. Random Forest, Support Vector Machine, and Multilayer
Perceptron classifiers were used to build the baseline models with these embeddings. The
classifiers were applied after the essential features had been extracted from the processed data.
The models were trained on the training set and fine-tuned using the development set. Their
effectiveness was assessed by predicting the labels for the held-out test set. The models
considered and their performance on the sentiment analysis task are reported in Tables 3 and 4
for the Tamil dataset, Tables 5 and 6 for the Malayalam dataset, and Tables 7 and 8 for the
Kannada dataset.</p>
        <p>A similar approach was chosen for the Homophobia Detection task. The same feature
extraction techniques were employed to reduce the number of features in the input, along
with the classifier models mentioned above. The models under consideration and their
performance on the development dataset are tabulated in Tables 9 and 10 for the Tamil dataset,
Tables 11 and 12 for the Malayalam dataset, and Tables 13 and 14 for the code-mixed Tamil
dataset.</p>
        <p>Feature extraction aims to reduce the number of features in the datasets by creating new
features from the existing ones and then discarding the original features. The feature extraction
methods we followed are:</p>
        <list list-type="bullet">
          <list-item>
            <p>Count Vectorizer: The Count Vectorizer feature extractor breaks a sentence or any text
down into individual words by performing preprocessing tasks. It converts the text into a vector
whose entries depend on the frequency with which each word occurs in the text.</p>
          </list-item>
          <list-item>
            <p>TF-IDF: TF-IDF stands for term frequency-inverse document frequency. A TF-IDF score is
calculated for each word in the corpus relative to the dataset, and the scores are then placed into
a vector. Each document in the corpus has its own vector, with a TF-IDF score for each word in
the entire collection of documents. This representation is typically employed in fields such as
text mining and information retrieval.</p>
          </list-item>
        </list>
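        <p>To make the two representations concrete, the following sketch computes count vectors and TF-IDF vectors by hand for a tiny corpus. It is illustrative only: the three documents are invented, the function names are hypothetical, and the TF-IDF formula is the plain textbook form (term frequency times the logarithm of the inverse document frequency). Library implementations such as scikit-learn’s TfidfVectorizer apply smoothing and normalization by default, so their numbers will differ.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter

# Invented toy corpus; whitespace tokenization is enough for the illustration.
docs = [
    "padam semma super",
    "padam waste",
    "super super song",
]
tokenized = [d.split() for d in docs]

# Vocabulary: every distinct word in the corpus, in sorted order.
vocab = sorted({w for doc in tokenized for w in doc})


def count_vector(tokens):
    """Count Vectorizer idea: one dimension per vocabulary word, value = raw count."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def tfidf_vector(tokens):
    """Textbook TF-IDF: (count / document length) * log(N / document frequency)."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    vec = []
    for w in vocab:
        tf = counts[w] / len(tokens)
        df = sum(1 for doc in tokenized if w in doc)
        vec.append(tf * math.log(n_docs / df))
    return vec


print(vocab)
print([count_vector(t) for t in tokenized])
print([[round(x, 3) for x in tfidf_vector(t)] for t in tokenized])
```
]]></preformat>
        <p>In the count vectors, a repeated word simply gets a larger entry. In the TF-IDF vectors, a word that appears in several documents is down-weighted relative to a word that appears in only one.</p>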
        <p>In our proposed approach, TF-IDF, Count Vectorizer, and BERT embeddings were extracted
from the dataset. The extracted features were then used to train multiple machine learning
models: SVM, MLP, Random Forest, AdaBoost, Gradient Boosting, and ExtraTrees classifiers.
The experiments were conducted on the Tamil, Malayalam, and Kannada code-mixed datasets for
the Sentiment Analysis task, and on the Tamil, Malayalam, and Tamil-English datasets for the
Homophobia Detection task. The models that obtained the best results were used to generate
the labels for the test dataset.</p>
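        <p>The overall experiment loop (extract features, train several classifiers, compare them on the development set, keep the best) might look like the sketch below when written with scikit-learn. This is an assumption-laden illustration rather than our actual configuration: the toy texts and labels are invented, only three of the classifiers are shown, the hyperparameters are arbitrary, and macro-averaged F1 is assumed as the selection metric.</p>
        <preformat><![CDATA[
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy stand-ins for the training and development splits (hypothetical data).
train_texts = ["semma padam", "super song", "nalla movie", "vera level",
               "waste padam", "mokka movie", "worst song", "bore padam"]
train_labels = ["Positive"] * 4 + ["Negative"] * 4
dev_texts = ["semma song", "mokka padam", "super movie", "worst movie"]
dev_labels = ["Positive", "Negative", "Positive", "Negative"]

# Feature extractors and classifiers to combine; hyperparameters are arbitrary.
vectorizers = {"count": CountVectorizer, "tfidf": TfidfVectorizer}
classifiers = {
    "svm": lambda: SVC(kernel="linear"),
    "mlp": lambda: MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    "rf": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for vec_name, make_vec in vectorizers.items():
    for clf_name, make_clf in classifiers.items():
        # Each pipeline vectorizes the raw text and then fits the classifier.
        model = make_pipeline(make_vec(), make_clf())
        model.fit(train_texts, train_labels)
        predictions = model.predict(dev_texts)
        scores[(vec_name, clf_name)] = f1_score(dev_labels, predictions, average="macro")

# The combination with the highest development-set F1 would label the test set.
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```
]]></preformat>
        <p>Wrapping the vectorizer and the classifier in one pipeline ensures that the vocabulary is learned from the training split only and then reused unchanged on the development split.</p>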
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Analysis</title>
      <p>In this section, we discuss the performance of the techniques and models implemented
and choose the most accurate model to generate the test labels.</p>
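      <p>Since the models below are compared by F1-score, a short sketch of how the metric is computed may be useful. The averaging scheme is not specified in this paper, so both the macro and the weighted variants are shown as assumptions. The function names and the six gold and predicted labels are invented for illustration.</p>
      <preformat><![CDATA[
```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(f1_per_class(y_true, y_pred, c) for c in labels) / len(labels)


def weighted_f1(y_true, y_pred):
    """Per-class F1 weighted by each class's share of the true labels."""
    labels = sorted(set(y_true))
    return sum(
        f1_per_class(y_true, y_pred, c) * y_true.count(c) / len(y_true)
        for c in labels
    )


gold = ["Homophobia", "Non-LGBT", "Non-LGBT", "Transphobia", "Non-LGBT", "Homophobia"]
pred = ["Homophobia", "Non-LGBT", "Homophobia", "Non-LGBT", "Non-LGBT", "Homophobia"]
print(round(macro_f1(gold, pred), 3), round(weighted_f1(gold, pred), 3))  # 0.489 0.6
```
]]></preformat>
      <p>In this toy example the rare class is never predicted, which pulls the macro average well below the weighted average. The same effect can be expected on imbalanced datasets.</p>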
      <sec id="sec-4-1">
        <title>4.1. Sentiment Analysis</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. Tamil-English Dataset</title>
          <p>Feature extraction techniques, namely Count Vectorizer, TF-IDF, and BERT embeddings, were
employed to extract the necessary features from the YouTube comments in Tamil-English
code-mixed text. The extracted feature vectors were used to train different machine learning
algorithms, which were then evaluated on the development data. The results are tabulated in
Table 3 and Table 4. From the tables, we see that Count Vectorizer with the Random Forest
model achieved the best F1-score of 0.61.</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Malayalam-English Dataset</title>
          <p>Feature extraction techniques, namely Count Vectorizer, TF-IDF, and BERT embeddings, were
employed to extract the necessary features from the YouTube comments in Malayalam-English
code-mixed text. The extracted feature vectors were used to train different machine learning
algorithms, which were then evaluated on the development data. The results are tabulated in
Table 5 and Table 6. From the tables, we see that Count Vectorizer with the MLP classifier
achieved the best F1-score of 0.63.</p>
        </sec>
        <sec id="sec-4-1-3">
          <title>4.1.3. Kannada-English Dataset</title>
          <p>Feature extraction techniques, namely Count Vectorizer, TF-IDF, and BERT embeddings, were
employed to extract the necessary features from the YouTube comments in Kannada-English
code-mixed text. The extracted feature vectors were used to train different machine learning
algorithms, which were then evaluated on the development data. The results are tabulated in
Table 7 and Table 8. From the tables, we see that Count Vectorizer with the Random Forest
model achieved the best F1-score of 0.61.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Homophobia Detection</title>
        <sec id="sec-4-2-1">
          <title>4.2.1. Tamil Dataset</title>
          <p>Feature extraction techniques, namely Count Vectorizer, TF-IDF, and BERT embeddings, were
employed to extract the necessary features from the YouTube comments in Tamil text. The
extracted feature vectors were used to train different machine learning algorithms, which were
then evaluated on the development data. The results are tabulated in Table 9 and Table 10.
From the tables, we see that TF-IDF with the Random Forest model achieved the best F1-score
of 0.88.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Malayalam Dataset</title>
          <p>Feature extraction techniques, namely Count Vectorizer, TF-IDF, and BERT embeddings, were
employed to extract the necessary features from the YouTube comments in Malayalam text. The
extracted feature vectors were used to train different machine learning algorithms, which were
then evaluated on the development data. The results are tabulated in Table 11 and Table 12.
From the tables, we see that Count Vectorizer and TF-IDF with the Random Forest model,
along with the ExtraTrees classifier, achieved the best F1-score of 0.96.</p>
        </sec>
        <sec id="sec-4-2-3">
          <title>4.2.3. Tamil-English Dataset</title>
          <p>Feature extraction techniques, namely Count Vectorizer, TF-IDF, and BERT embeddings, were
employed to extract the necessary features from the YouTube comments in Tamil-English
code-mixed text. The extracted feature vectors were used to train different machine learning
algorithms, which were then evaluated on the development data. The results are tabulated in
Table 13 and Table 14. From the tables, we see that Count Vectorizer with SVM and MLP, and
TF-IDF with MLP, achieved the best F1-score of 0.86.</p>
          <p>The development dataset was used to evaluate the performance of the models after training
them. The final performance results for the tasks are recorded in Tables 15 and 16.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this study, we examined the baseline accuracy of various models and model variants on
the test datasets. The challenge at hand was to determine whether a comment carries sentiment
and whether it disparages LGBT+ individuals in any way. Sentiment analysis and homophobia
detection are in high demand on social media. Our team submitted these findings after taking
part in the FIRE 2022 competition. For both tasks, our models performed at a baseline level, and
performance could be improved by incorporating additional beneficial features.</p>
      <p>[1] L. McInroy, S. Craig, Transgender representation in offline and online media: LGBTQ
youth perspectives, Journal of Human Behavior in the Social Environment 25 (2015) 1–12.</p>
      <p>[2] B. R. Chakravarthi, R. Priyadharshini, R. Ponnusamy, P. K. Kumaresan, K. Sampath,
D. Thenmozhi, S. Thangasamy, R. Nallathambi, J. P. McCrae, Dataset for
identification of homophobia and transophobia in multilingual YouTube comments, 2021. URL:
https://arxiv.org/abs/2109.00227. doi:10.48550/ARXIV.2109.00227.</p>
      <p>[3] E. McConnell, A. Clifford, A. Korpak, G. Phillips, M. Birkett, Identity, victimization,
and support: Facebook experiences and mental health among LGBTQ youth, Computers
in Human Behavior 76 (2017) 237–244. doi:10.1016/j.chb.2017.07.026.</p>
      <p>[4] A. Kalaivani, D. Thenmozhi, Ssn_nlp_mlrg@dravidian-codemix-fire2020: Sentiment
codemixed text classification in Tamil and Malayalam using ULMFiT, in: FIRE, 2020.</p>
      <p>[5] A. Hande, S. U. Hegde, R. Priyadharshini, R. Ponnusamy, P. K. Kumaresan, S. Thavareesan,
B. R. Chakravarthi, Benchmarking multi-task learning for sentiment analysis and offensive
language identification in under-resourced Dravidian languages, CoRR abs/2108.03867
(2021). URL: https://arxiv.org/abs/2108.03867. arXiv:2108.03867.</p>
      <p>[6] K. Shanmugavadivel, M. Subramanian, P. K. Kumaresan, B. R. Chakravarthi, B. B, S.
Chinnaudayar Navaneethakrishnan, L. S.K, T. Mandl, R. Ponnusamy, V. Palanikumar, M. B.
J, Overview of the Shared Task on Sentiment Analysis and Homophobia Detection of
YouTube Comments in Code-Mixed Dravidian Languages, in: Working Notes of FIRE 2022
- Forum for Information Retrieval Evaluation, CEUR, 2022.</p>
      <p>[7] D. Thenmozhi, A. Chandrabose, S. Sharavanan, et al., Ssn_nlp at SemEval-2019 task 3:
Contextual emotion identification from textual conversation using seq2seq deep neural
network, in: Proceedings of the 13th International Workshop on Semantic Evaluation,
2019, pp. 318–323.</p>
      <p>[8] B. R. Chakravarthi, R. Priyadharshini, V. Muralidaran, N. Jose, S. Suryawanshi, E. Sherly,
J. P. McCrae, DravidianCodeMix: sentiment analysis and offensive language
identification dataset for Dravidian languages in code-mixed text, Language Resources
and Evaluation (2022). URL: https://doi.org/10.1007/s10579-022-09583-7.
doi:10.1007/s10579-022-09583-7.</p>
      <p>[9] A. Sampath, T. Durairaj, B. R. Chakravarthi, R. Priyadharshini, S. Chinnaudayar
Navaneethakrishnan, K. Shanmugavadivel, S. Thavareesan, S. Thangasamy, P. Krishnamurthy,
A. Hande, S. Benhur, S. Ponnusamy, Kishor Kumar Pandiyan, Findings of the shared
task on Emotion Analysis in Tamil, in: Proceedings of the Second Workshop on Speech
and Language Technologies for Dravidian Languages, Association for Computational
Linguistics, 2022.</p>
      <p>[10] S. Thavareesan, S. Mahesan, Sentiment analysis in Tamil texts using k-means and k-nearest
neighbour, in: 2021 10th International Conference on Information and Automation for
Sustainability (ICIAfS), 2021, pp. 48–53. doi:10.1109/ICIAfS52090.2021.9605839.</p>
      <p>[11] B. R. Chakravarthi, R. Priyadharshini, T. Durairaj, J. McCrae, P. Buitelaar, P. Kumaresan,
R. Ponnusamy, Overview of the shared task on homophobia and transphobia detection in
social media comments, in: Proceedings of the Second Workshop on Language Technology
for Equality, Diversity and Inclusion, Association for Computational Linguistics, Dublin,
Ireland, 2022, pp. 369–377. URL: https://aclanthology.org/2022.ltedi-1.57.
doi:10.18653/v1/2022.ltedi-1.57.</p>
      <p>[12] V. S, K. Rajan, A. S, R. Sivanaiah, S. M. Rajendram, M. T T,
Varsini_and_Kirthanna@DravidianLangTech-ACL2022-emotional analysis in Tamil,
in: Proceedings of the Second Workshop on Speech and Language Technologies
for Dravidian Languages, Association for Computational Linguistics, Dublin,
Ireland, 2022, pp. 165–169. URL: https://aclanthology.org/2022.dravidianlangtech-1.26.
doi:10.18653/v1/2022.dravidianlangtech-1.26.</p>
      <p>[13] K. Anantharaman, D. K, J. Pt, A. S, R. Sivanaiah, S. M. Rajendram, M. T T, SSN_MLRG1 at
SemEval-2022 task 10: Structured sentiment analysis using 2-layer BiLSTM, in: Proceedings
of the 16th International Workshop on Semantic Evaluation (SemEval-2022), Association
for Computational Linguistics, Seattle, United States, 2022, pp. 1324–1328. URL:
https://aclanthology.org/2022.semeval-1.184. doi:10.18653/v1/2022.semeval-1.184.</p>
      <p>[14] K. Swaminathan, B. Bharathi, G. Gayathri, H. Sampath, Ssncse_nlp@ lt-edi-acl2022:
Homophobia/transphobia detection in multiple languages using SVM classifiers and
BERT-based transformers, in: Proceedings of the Second Workshop on Language Technology
for Equality, Diversity and Inclusion, 2022, pp. 239–244.</p>
      <p>[15] B. Bharathi, G. Samyuktha, Machine learning based approach for sentiment analysis on
multilingual code mixing text, in: Working Notes of FIRE 2021-Forum for Information
Retrieval Evaluation (Online). CEUR, 2021.</p>
      <p>[16] N. N. A. Balaji, B. Bharathi, J. Bhuvana, Ssncse_nlp@ dravidian-codemix-fire2020:
Sentiment analysis for Dravidian languages in code-mixed text, in: FIRE (Working Notes),
2020, pp. 554–559.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>