
Overview of Sarcasm Identification of Dravidian Languages in DravidianCodeMix@FIRE-2024

Sripriya N

sripriyan@ssn.edu.in 3

Bharathi B

bharathib@ssn.edu.in 3

Thenmozhi Durairaj

Nandhini K

Rahul Ponnusamy

Prasanna Kumar Kumaresan

Kishore Kumar Ponnusamy

kishorep161002@gmail.com 1

Charmathi Rajkumar

Bharathi Raja Chakravarthi 2

0 Central University of Tamil Nadu, Tamil Nadu, India
1 Digital University Kerala, Kerala, India
2 Insight SFI Research Centre for Data Analytics, University of Galway, Galway, Ireland
3 Sri Sivasubramaniya Nadar College of Engineering, Tamil Nadu, India
4 The American College, Tamil Nadu, India

Sarcasm is a linguistic expression that conveys the opposite of what is literally stated. The growth of social media platforms such as WhatsApp, Instagram, Twitter, and Facebook has led to the extensive use of sarcastic content among the public. Identifying sarcasm in such data has become critical because of its significance in related fields such as sentiment analysis and emotion recognition. It could provide businesses and politicians with accurate insight, as it reflects the true opinion of people. In recent times, comments and posts on social media are often code-mixed, and detecting sarcasm in code-mixed text poses greater challenges for NLP. This paper presents an overview of the sarcasm identification shared task conducted as part of DravidianCodeMix@FIRE-2024. The main goal of the task is to encourage researchers to develop systems that identify sarcasm in a dataset of code-mixed social media comments in Dravidian languages, particularly Tamil and Malayalam mixed with English. Twenty-three teams participated in the shared task, which focused on classifying comments as sarcastic or non-sarcastic. The paper describes the models used by the teams to identify sarcasm in code-mixed text. The performance of all submitted systems was evaluated using the macro-F1 score, and the results are reported.

Keywords: Sarcasm Identification, Corpus Creation, Classification, Code-Mixing, Dravidian Languages

1. Introduction

Social media platforms have changed the way the world communicates. People express their thoughts and feelings on platforms such as Twitter and WhatsApp, reflecting their opinions about a particular topic, product, or news item. Sarcasm is a linguistic expression that usually carries the opposite of the meaning directly conveyed by the words. Sarcasm strongly influences the tone and interpretation of discussions on these platforms, where it can be used to criticize political news or mock societal trends [1]. Sentiment analysis is an NLP task that performs contextual analysis of text to identify the subjectivity and sentiments present in opinions. Sarcasm often expresses negative sentiment using positive words, and this inherent ambiguity confuses sentiment analysis models. Misinterpreting sarcastic remarks can lead to incorrect sentiment classification, with serious effects on the overall analysis of public sentiment [2]. Sarcasm identification is essential in social media to promote positive communication and prevent conflicts. Identifying sarcastic intent from text alone is more difficult than detecting it with the help of visual cues such as facial expression and body language. The complex interplay of linguistic, pragmatic, and contextual factors makes sarcasm identification challenging [3].

Studies in the literature [1] show that sarcasm influences the perceived sentiment of a post and also plays an important role in altering public discourse. Sarcastic comments can go viral, amplifying their impact and altering the narrative of online conversations. Identifying sarcasm in textual content has several real-time applications. In social media monitoring, sarcasm detection can help identify and mitigate harmful online behaviors such as cyberbullying and hate speech. In customer service, understanding sarcastic intent can improve customer satisfaction and response times. Additionally, sarcasm detection can be used to enhance the performance of chatbots and virtual assistants, allowing them to engage in more natural and nuanced conversations. Identifying sarcasm in conversation remains a persistent challenge, and researchers have started to focus on multimodal sarcasm detection [4] [5].

Researchers have explored various techniques for sarcasm detection, such as rule-based approaches, machine learning models, and deep learning frameworks. Early studies proposed linguistic approaches to sarcasm identification by analyzing patterns of irony in text [6]. More recent work by Ghosh et al. [7] and Jain et al. [8] employed machine learning models that leverage features such as punctuation, word embeddings, and contextual information to improve the accuracy of sarcasm and offensive content detection. While significant progress has been made in recent years, especially for English, identifying sarcasm, sentiment, and offensive content remains particularly challenging for under-resourced languages such as the Dravidian languages [9][10][11]. The rise of multilingualism and code-mixing in online communication adds another layer of complexity to sarcasm detection. Code-mixed texts are sentences or conversations that comprise words from multiple languages; they are common in multilingual communities where speakers frequently switch between local languages and English. There is a growing need for sarcasm identification and sentiment detection in social media communication in Dravidian languages, which is largely code-mixed [12] [13]. Detecting sarcasm in such code-mixed texts requires specialized models that account for language switching, cultural context, and nuanced expressions [14]. Research in the literature highlights various challenges in detecting sarcasm in code-mixed languages and emphasizes the need for multilingual sarcasm identification models [15] [16].

Sarcasm identification is important for improving the overall performance of NLP systems. Identifying sarcastic content precisely helps NLP models improve sentiment analysis, enables better content moderation, and enhances the quality of conversational bots, resulting in more meaningful and context-aware interactions in social media environments. This real-world importance of sarcasm detection is the stimulus for conducting this shared task in recent years [17]. Participants were provided with two datasets containing YouTube comments that are code-mixed in Dravidian languages. The focus of the shared task was to classify comments in Tamil mixed with English and in Malayalam mixed with English as sarcastic or not.

The principal aim of this shared task is to develop systems capable of identifying sarcasm in the given datasets of social media posts that are code-mixed in Tamil and Malayalam. The average length of a post in the corpora is one sentence, though a few posts contain multiple sentences. Each comment is annotated as sarcastic or not. The datasets contain more non-sarcastic posts than sarcastic ones, which reflects the class imbalance found in real-world data. In this shared task, participants were given training, validation, and test datasets. The challenge involved polarity classification of posts, where participants were asked to determine whether a given YouTube comment was sarcastic or not. To the best of our knowledge, this shared task series is the first initiative to identify sarcasm in Dravidian code-mixed languages.

The systems developed by the participating teams, along with their results, are discussed in this paper. The paper is organized as follows: Section 2 describes the shared task, Section 3 discusses the datasets used, and Section 4 describes the techniques used by the teams. Finally, Section 5 presents the results and rankings secured by each team, followed by concluding remarks in Section 6.

2. Task Description

The objective of this task is to identify sarcasm in code-mixed texts involving Dravidian languages. In particular, the focus is on code-mixed text in Tamil and Malayalam with English. The dataset is obtained from social media platforms, specifically YouTube comments. The participants were given the challenge of predicting sarcasm in the given comments. Contestants were given the training and validation datasets initially, and the test datasets were released later for evaluation. This shared task on sarcasm detection in Dravidian languages has been conducted as a series since last year, and this is its second edition. Further information on the task is available on the CodaLab site (see footnote 1).

3. Datasets

The datasets of Tamil and Malayalam text mixed with English comprise social media comments with different types of code-mixed sentences: inter-sentential switching, intra-sentential switching, and tag switching. The majority of the comments include a combination of native and Roman script. The datasets include sentences framed using Tamil or Malayalam grammar mixed with English words, as well as sentences following English grammar that are interspersed with Malayalam or Tamil vocabulary. Each dataset is divided into training, validation, and test sets. Training and validation sets were provided with class labels, while the test sets used for evaluation were released unlabeled. The data distribution of the training, validation, and test sets is given in Table 1. Notably, the datasets contain a higher proportion of non-sarcastic comments than sarcastic ones, as shown in Table 2. This class imbalance skews the datasets, which participants needed to take into account while designing their classification systems.

Table 1: Data distribution of the training, validation, and test sets

Split    Tamil mixed English    Malayalam mixed English
Train    29,570                 13,188
Dev      6,336                  2,826
Test     6,338                  2,826
Total    42,244                 18,840
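As a quick consistency check on Table 1, the short sketch below recomputes the totals and the share of each split. The counts are copied from the table; the helper name `split_shares` is ours and only illustrative. The numbers suggest an approximately 70/15/15 split for both language pairs, although the paper does not state how the split was produced.

```python
# Split sizes copied from Table 1 (rows: Train, Dev, Test).
splits = {
    "Tamil-English": {"train": 29570, "dev": 6336, "test": 6338},
    "Malayalam-English": {"train": 13188, "dev": 2826, "test": 2826},
}


def split_shares(counts):
    """Return (total, {split: share of total, rounded to 3 decimals})."""
    total = sum(counts.values())
    return total, {name: round(n / total, 3) for name, n in counts.items()}


for language, counts in splits.items():
    total, shares = split_shares(counts)
    print(language, total, shares)
```

The totals reproduce the last row of the table (42,244 and 18,840).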

4. Methodology

Twenty-three teams actively participated in this shared task to identify sarcasm in two code-mixed language pairs, Tamil mixed with English and Malayalam mixed with English. The contestants explored a variety of methodologies to classify a given comment as sarcastic or not [18].

Awsathama team [19] developed a sophisticated combination of transformer models and diverse data processing techniques to detect sarcasm in the Dravidian languages Tamil and Malayalam. The approach leveraged the strengths of mBERT, Indic-BERT, XLM-RoBERTa, and MuRIL to capture the nuanced linguistic features unique to these languages. A baseline LSTM with an attention mechanism was also used to establish reference performance. To enhance model efficacy, various data augmentation strategies were implemented. The process began with the original dataset, applying back translation specifically to the minority classes to achieve balance. Additionally, cross-lingual translation between Tamil and Malayalam was performed for these minority classes. This comprehensive data augmentation aimed to improve the models' generalization abilities, resulting in more accurate sarcasm detection across different contexts and linguistic variations, ultimately securing a top F1 score of 0.74 in the Tamil task and 0.75 in the Malayalam task.

1 https://codalab.lisn.upsaclay.fr/competitions/19310

Text_Catalysts team [20] investigated three models: DistilBERT, GRU, and LSTM. The study demonstrates that, among these models, DistilBERT is the most effective at detecting sarcasm in Tamil text. DistilBERT, a lightweight but effective model, is well suited to detecting sarcasm because of its ability to capture subtle contextual elements in text. It yields an F1 score of 0.74 on the test set, making it the best performer among the three.

Change_Makers team [21] explored conventional algorithms, including logistic regression, random forest, and a naive Bayes classifier, alongside the transformer-based BERT. Performance was evaluated across the datasets, focusing on key metrics such as accuracy and F1-score. BERT demonstrated superior performance, effectively capturing contextual nuances in sarcasm detection, which makes it a more viable approach for multilingual and code-mixed environments. This team attained the top score of 0.74 in the Tamil-English subtask.

MUCS team [22] proposed two distinct models to perform sarcasm detection: i) a Long Short-Term Memory (LSTM) model using Keras embeddings, and ii) an mBERT+CNN model, which combines the Multilingual Bidirectional Encoder Representations from Transformers (mBERT) tokenizer for embeddings (a transformer-based approach) with a Convolutional Neural Network (CNN) for classification. The team handled the class imbalance in the dataset by applying text augmentation with contextual word embeddings to expand the minority class. Among the proposed models, the mBERT+CNN model achieved superior performance, securing macro F1 scores of 0.74 for the Tamil-English subtask and 0.72 for the Malayalam-English subtask, ranking 1st and 2nd, respectively.

UMSNH_NLP team's approach integrates bag-of-words and deep learning models, each of which solves the task independently [23]. A new feature space is then constructed from the decision functions of the individual models, and this feature space is fed into an XGBoost classifier for the final prediction. The generic text categorization system, FastText, achieves the best performance for both the Tamil and Malayalam subtasks, with F1-scores of 0.74 and 0.76, respectively.

IRLab@IITBHU team [24] explored a new technique for sarcasm identification using BERT with an additional neural network layer. The team also employed ChatGPT for the same task and conducted a comparative study between GPT and BERT-based models. The experiment demonstrated that the BERT-based model effectively detected sarcasm, securing an F1 score of 0.74 for both the Tamil and Malayalam code-mixed datasets, while GPT attained an F1 score of 0.64 on the same datasets. These results reflected strong overall performance, placing the model third for the Malayalam-English language pair and first for the Tamil-English language pair.

Sarcasm_NLP team [25] tackled sarcasm detection in Dravidian languages by exploring the difficulties posed by code-mixing, dialectal variations, and the scarcity of annotated datasets. The team investigated three transformer-based models: (i) DistilBERT, (ii) GoogleBERT, and (iii) RoBERTa, to effectively capture the subtleties of sarcasm in these languages. Experimental results highlight the potential of transformers for achieving strong performance in multilingual sarcasm detection, with F1-scores of 0.73 and 0.72 for the Tamil and Malayalam subtasks, respectively.

PixelPhrase team [26] proposed a model architecture consisting of a BERT encoder and a classification layer, which generates a probability score indicating the likelihood of sarcasm. To assess the performance of the models, various validation metrics, such as recall, precision, F1 score, and AUC, were used. The results demonstrated that this method outperformed existing approaches. It obtained an F1 score of 0.73 on the Tamil test set and 0.72 on the Malayalam test set.

JUNLP_Amit Barman team [40] developed a hybrid model that combines CNNs, Bi-LSTM networks, and an AdaBoost classifier for detecting sarcasm. The model demonstrates the combination of deep learning-based features and classical machine learning techniques for detecting sarcasm in a multilingual, code-mixed context. Its performance, measured by F1-score, was 0.72 on the Malayalam dataset.

The team CodeSpark [27] used advanced deep learning models, such as bidirectional LSTMs, combined with specialized tokenization and embedding techniques, which resulted in substantial advancements in sarcasm detection. The system achieved a macro-F1 score of 0.72 in the Tamil subtask, securing the third position, and 0.74 in the Malayalam subtask.

KEC_Tech_Titan team [41] developed a system that achieved 6th place in the Malayalam subtask of the Dravidian track. Sarcasm, which often depends on context, tone, and cultural nuances, presents significant challenges for machine learning models. This team explored various machine learning and deep learning models for identifying sarcasm in Malayalam text. A range of models was used, including RoBERTa, CNN, Multi-layer Perceptron (MLP), Gated Recurrent Units (GRU), Random Forests (RF), Hidden Markov Models (HMM), K-Nearest Neighbors (KNN), Logistic Regression (LR), and Gaussian Mixture Models (GMM).

Beyond_Tech team [29] studied two deep learning techniques for sarcasm identification in various languages: a hybrid model that integrates several neural network models, and a Bi-LSTM model. The hybrid model improved sarcasm detection by leveraging long-range dependencies alongside local feature extraction through the combination of multiple architectures. Comprehensive preprocessing techniques, including tokenization, padding, and label encoding, were applied to the Malayalam and Tamil sarcasm datasets. The hybrid model proposed by the team outperformed the Bi-LSTM in accuracy and F1-scores, ranking 5th on the Tamil subtask with a macro F1 score of 0.70 and 7th on the Malayalam subtask with an F1 score of 0.67.
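The preprocessing steps named above (tokenization, padding, and label encoding) can be sketched in a few lines of plain Python. This is a minimal illustration under our own assumptions, not the team's code: the whitespace tokenizer, the function names, the example comments, and the label strings are all hypothetical. Sequences are padded and truncated at the front, which matches the defaults of the Keras `pad_sequences` utility.

```python
def encode_labels(labels):
    """Map class names to integer ids (classes sorted alphabetically)."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


def tokenize_and_pad(texts, max_len, pad_id=0):
    """Whitespace-tokenize, map tokens to ids, and pre-pad to max_len."""
    vocab = {}
    sequences = []
    for text in texts:
        # Assign ids starting at 1 so that 0 stays reserved for padding.
        ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in text.lower().split()]
        ids = ids[-max_len:]  # keep the last max_len tokens if too long
        sequences.append([pad_id] * (max_len - len(ids)) + ids)
    return sequences, vocab


# Hypothetical code-mixed comments and labels, for illustration only.
texts = ["enna acting da", "super", "vera level comedy scene da"]
labels = ["Sarcastic", "Non-sarcastic", "Sarcastic"]

padded, vocab = tokenize_and_pad(texts, max_len=4)
y, classes = encode_labels(labels)
print(padded)
print(y, classes)
```

A real system would normally use a subword tokenizer suited to code-mixed text rather than whitespace splitting.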

SSN_Language team's approach uses a multilingual language model designed to handle code-mixed and multilingual text to obtain relevant features from the input data [30]. The tokenized inputs are used to derive high-dimensional features, providing robust text representations. These features are then fed into three machine learning models: Multinomial Naive Bayes, Logistic Regression, and Random Forest Classifier. Each model is trained on the Tamil and Malayalam code-mixed datasets to classify the text into sarcastic and non-sarcastic categories, securing F1 scores of 0.7 and 0.62, respectively.

Code Crafters team [31] evaluated the efficiency of various models, including machine learning approaches such as XGBoost, LightGBM, and CatBoost, and deep learning models such as LSTM and GRU. To address class imbalance, SMOTE was applied to the machine learning models and GRU, while sequence pre-padding was used for LSTM. The results indicate that SMOTE improves macro-average F1 scores and accuracy across most models, reaching a notable macro F1 score of 0.69.
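To make the idea behind SMOTE concrete, the sketch below generates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. It is a simplified, dependency-free illustration of the technique, not the team's implementation (which may well have used an existing library); the function name `smote_like`, the parameter values, and the toy feature vectors are assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors for the minority (sarcastic) class.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling of this kind should be applied to the training split only, never to validation or test data.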

CJM team [32] developed a system that used an MLP classifier with custom-generated embeddings as input. A language-agnostic sentence transformer, which supports both Tamil and Malayalam, was used to generate text embeddings. Additionally, the LASER encoder pipeline was employed to create LASER embeddings for all texts. These two sets of embeddings were concatenated to form the final set of embeddings, which was used to train the MLP classifier. The team achieved a macro F1 score of 0.68 for Tamil and 0.70 for Malayalam.

MSD team [33] proposed an approach that first translates the Tamil mixed English and Malayalam mixed English texts into their corresponding English versions and then fine-tunes two models, BERT and XLM-RoBERTa, for sarcasm identification. The method demonstrates promising results, achieving an F1-score of 0.68 for both BERT and XLM-RoBERTa on Tamil-English posts, and a macro F1-score of 0.71 for Malayalam-English posts.

The_Three_Musketeers team [34] employed various traditional machine learning approaches to detect sarcastic content in Tamil and Malayalam comments. Among these, the logistic regression model achieved an F1 score of 0.68 for Tamil and 0.67 for Malayalam, demonstrating its capability in capturing the complex nuances of sarcasm in code-mixed Dravidian languages.

KEC_AIDS_79114 team [36] used TF-IDF vectorization to convert the given text into numerical features. Four models were evaluated for their effectiveness in sarcasm detection: Decision Tree (DT), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Logistic Regression (LR). Logistic Regression showed the best performance, achieving an F1-score of 0.61 for the Tamil code-mixed data and 0.58 for the Malayalam code-mixed data. The system secured 9th place in a competitive sarcasm detection task.
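For readers unfamiliar with TF-IDF, the following sketch shows how comments can be turned into sparse numerical features. It uses raw term frequency and idf = ln(N / df) + 1, which is only one of several common TF-IDF variants and is not necessarily the one the team used. The function name and the example comments are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = raw term frequency * (ln(N / df) + 1), where N is the number
    of documents and df is the number of documents containing the term.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * (math.log(n_docs / df[t]) + 1.0) for t in tf})
    return vectors


# Hypothetical code-mixed comments, for illustration only.
comments = ["semma movie da", "movie super", "semma semma acting"]
vectors = tfidf(comments)
for vec in vectors:
    print(vec)
```

Terms that occur in few comments (here "da" or "acting") receive a larger weight than terms shared across comments, and the resulting vectors can be passed to any of the classical classifiers listed above.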

The TextTitans team [37] used the zero-shot capabilities of GPT-3.5 Turbo to carry out sarcasm detection through prompting. This approach capitalizes on the advantages of large-scale pretrained models, accommodates multilingual and code-mixed environments, lessens the reliance on extensive annotated datasets, facilitates quick experimentation, and ensures scalability across various linguistic contexts. The use of clear and concise prompts enabled the model to focus on the main task of sarcasm detection, leading to reliable and interpretable results. The GPT model was run via prompting at three different temperature values, namely 0.7, 0.8, and 0.9. The system achieved a macro-F1 score of 0.61 for Tamil and 0.50 for Malayalam.

Tech_Chasers team [38] built a system for detecting sarcasm in Tamil code-mixed and Malayalam code-mixed sentences that uses a neural network architecture combining CNNs with Bi-LSTM layers. The system was trained with early stopping and checkpointing, achieving an F1-score of approximately 0.5 for both the Tamil-English and Malayalam-English code-mixed data.

Tr4nslate team vectorized the text using the TF-IDF vectorizer. A meta-stacking ensemble model and an ensemble model comprising SVM, KNN, Logistic Regression, and Decision Tree were used for the Tamil task and produced an F1 score of 0.71. A Random Forest classifier was used for the Malayalam dataset and produced a score of 0.67.

Tech_Army_KEC team [28] combined traditional classifiers such as Support Vector Machines (SVM), Logistic Regression, and Random Forest with advanced methods such as CNNs, LSTM, and transformer-based models such as BERT and ALBERT. Hierarchical Attention Networks (HAN) and Gradient Boosting techniques were also used. The system identified sarcastic comments across the two languages, yielding scores of 0.7 for Tamil and 0.67 for Malayalam.

KEC_AI_InnovationEngineers team [35] used a few machine learning methods: Logistic Regression for binary classification, Support Vector Classifier (SVC) for non-linear decision boundaries, and Random Forest for ensemble learning. The system was trained and evaluated on the Tamil sarcasm identification task and produced a score of 0.67.

DLRG team used Multilingual BERT (mBERT) to identify sarcastic content in Tamil code-mixed text and obtained an F1 score of 0.49.

JUNLP team [39] built a model using a CNN followed by an LSTM and an AdaBoost classifier to identify sarcasm in the given Tamil code-mixed dataset and produced an F1-score of 0.47.

SSNites used the mBERT model and fine-tuned it on the given Tamil code-mixed and Malayalam code-mixed data for sarcasm detection, producing scores of 0.24 and 0.57, respectively.

5. Results and Discussion

In FIRE 2024, 23 teams competed in the shared task of detecting sarcasm in code-mixed data from Dravidian languages. The performance of all models was assessed using the F1 score [18]. The F1 score is the harmonic mean of precision and recall, so it balances false positives and false negatives, which makes it well suited to evaluating sarcasm detection models. The results illustrate the complexity of sarcasm recognition across the Tamil and Malayalam languages, stressing its importance in sentiment research. In spite of the problems of linguistic diversity, code-mixing, and social inequities, all teams created systems with promising potential.
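The macro-averaged variant of this metric computes the F1 score separately for each class and then takes the unweighted mean, so the minority (sarcastic) class counts as much as the majority class. The sketch below is a minimal reference implementation written for this overview, not the official evaluation script; the label vectors are hypothetical, and returning 0.0 when a class has no true positives is a convention we assume.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # assumed convention when precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical imbalanced example: 8 non-sarcastic (0) and 2 sarcastic (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
score = macro_f1(y_true, y_pred)
print(score)  # 0.6875
```

In this toy example the accuracy is 0.8, but the macro-F1 is only 0.6875 because the sarcastic class has an F1 of 0.5. This is why macro-F1 is more informative than accuracy on an imbalanced dataset.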

Recently, transformer-based language models have shown outstanding efficiency by using powerful embedding representations and self-attention mechanisms, advancing the field of language comprehension. In this shared task, six teams, "Awsathama", "Team_Catalysts", "Change_Makers", "MUCS", "UMSNH_NLP", and "IRLab@IITBHU", shared first place in the Tamil code-mixed task with a macro-F1 score of 0.74, and the team "UMSNH_NLP" achieved first place in the Malayalam code-mixed task.

The "Awsathama" team [19] designed a sophisticated ensemble of transformer models and data processing techniques to detect sarcasm in Malayalam and Tamil. This approach harnessed the power of mBERT, XLM-RoBERTa, Indic-BERT, and MuRIL to pinpoint the subtle linguistic nuances specific to these Dravidian languages. A baseline LSTM with an attention mechanism was also employed to set a performance benchmark. To further refine model effectiveness, several data augmentation strategies were applied.

The "UMSNH_NLP" team incorporated bag-of-words and deep learning models, each tackling the task independently. A novel feature space was subsequently created from the decision functions of the individual models and fed into an XGBoost classifier for the final prediction. The generic text categorization system, FastText, attained the highest performance for both the Tamil and Malayalam code-mixed tasks, with F1-scores of 0.74 and 0.76, respectively.

"Team_Catalysts" used DistilBERT, a lightweight but effective model for detecting sarcasm in Tamil text because of its capacity to capture subtle contextual elements. It achieved an F1 score of 0.74 on the test set, which equals the highest Tamil score among all teams. "Change_Makers" recommended the BERT classifier for text classification tasks such as sarcasm detection because of its strong design and capacity to handle large-scale datasets; it yielded the highest score for the Tamil-English dataset.

"MUCS" used the hybrid mBERT+CNN model and achieved superior performance, securing a macro F1 score of 0.74 for the Tamil-English dataset and 0.72 for the Malayalam-English dataset, ranking 1st and 2nd, respectively. The "IRLab@IITBHU" team demonstrated that the BERT-based model effectively detected sarcasm, achieving an F1 score of 0.74 for both the Tamil and Malayalam code-mixed datasets. The rank lists of the participants in the sarcasm identification task for the Tamil and Malayalam code-mixed datasets are given in Tables 3 and 4.
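Since several teams obtained identical scores, ties share a rank. The sketch below applies dense ranking (tied scores share a rank and the next distinct score receives the next integer) to a subset of the Tamil-English scores quoted in Section 4. The dense-ranking rule is an assumption we infer from the ranks mentioned in the text, and the function name is ours; Tables 3 and 4 remain the authoritative rank lists.

```python
def dense_rank(scores):
    """Map each team to its rank; tied scores share a rank and the next
    distinct score gets the next integer (dense ranking)."""
    distinct = sorted(set(scores.values()), reverse=True)
    rank_of = {score: i + 1 for i, score in enumerate(distinct)}
    return {team: rank_of[score] for team, score in scores.items()}


# Tamil-English macro-F1 scores as quoted in Section 4 (subset of teams).
tamil_scores = {
    "Awsathama": 0.74,
    "MUCS": 0.74,
    "IRLab@IITBHU": 0.74,
    "Sarcasm_NLP": 0.73,
    "CodeSpark": 0.72,
    "Tr4nslate": 0.71,
    "Beyond_Tech": 0.70,
}

tamil_ranks = dense_rank(tamil_scores)
for team, rank in sorted(tamil_ranks.items(), key=lambda item: item[1]):
    print(rank, team, tamil_scores[team])
```

Under this rule CodeSpark (0.72) is third and Beyond_Tech (0.70) is fifth on the Tamil subtask, which agrees with the positions stated for those teams in Section 4.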

6. Conclusion

The shared task on sarcasm detection in Dravidian languages emphasized the importance of language understanding in social media communication. Many social media posts now mix different languages, a practice known as code-mixing. The objective of the shared task is to design and develop innovative models to detect the inherent sarcasm in code-mixed data. This task encourages the research community to explore novel approaches to building robust and reliable sarcasm identification systems. The contestants were provided with two datasets, in Tamil and Malayalam mixed with English, containing posts scraped from social media platforms. Twenty-three teams participated and developed systems using diverse approaches, including traditional machine learning models, transformer-based models, and transfer learning. The results were evaluated and ranked based on the performance of the models, measured by the F1 score. The ideas used by each of the teams in building their systems were also highlighted, which will aid future research in this direction.

Acknowledgments

Author Bharathi Raja Chakravarthi supported this shared task through a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289_P2 (Insight_2). Rahul Ponnusamy and Prasanna Kumar Kumaresan also rendered support through the Science Foundation Ireland Centre for Research Training in Artificial Intelligence under Grant No. 18/CRT/6223 and the College of Science and Engineering, University of Galway, Ireland.

Declaration on Generative AI

The author(s) have not employed any Generative AI tools.

References

[1] A. Ghosh, T. Veale, Fracking sarcasm using neural network, in: Proceedings of the 7th workshop on computational approaches to subjectivity, sentiment and social media analysis, 2016, pp. 161–169.

[2] A. Joshi, P. Bhattacharyya, M. J. Carman, Automatic sarcasm detection: A survey, ACM Computing Surveys (CSUR) 50 (2017) 1–22.

[3] A. Reyes, P. Rosso, D. Buscaldi, From humor recognition to irony detection: The figurative language of social media, Data & Knowledge Engineering 74 (2012) 1–12.

[4] T. Yue, R. Mao, H. Wang, Z. Hu, E. Cambria, Knowlenet: Knowledge fusion network for multimodal sarcasm detection, Information Fusion 100 (2023) 101921.

[5] Y. Qiao, L. Jing, X. Song, X. Chen, L. Zhu, L. Nie, Mutual-enhanced incongruity learning network for multi-modal sarcasm detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, 2023, pp. 9507–9515.

[6] U. Shrawankar, C. Chandankhede, Sarcasm detection for workplace stress management, International Journal of Synthetic Emotions 10 (2019) 1–17. doi:10.4018/ijse.2019070101.

[7] D. Ghosh, A. R. Fabbri, S. Muresan, Sarcasm analysis using conversation context, Computational Linguistics 44 (2018) 755–792. URL: https://aclanthology.org/J18-4009. doi:10.1162/coli_a_00336.

[8] D. Jain, A. Kumar, G. Garg, Sarcasm detection in mash-up language using soft-attention based bidirectional lstm and feature-rich cnn, Applied Soft Computing 91 (2020) 106198. URL: https://www.sciencedirect.com/science/article/pii/S1568494620301381. doi:https://doi.org/10.1016/j.asoc.2020.106198.

[9] B. R. Chakravarthi, Hope speech detection in youtube comments, Social Network Analysis and Mining 12 (2022) 75.

[10] B. R. Chakravarthi, R. Priyadharshini, S. Banerjee, M. B. Jagadeeshan, P. K. Kumaresan, R. Ponnusamy, S. Benhur, J. P. McCrae, Detecting abusive comments at a fine-grained level in a low-resource language, Natural Language Processing Journal 3 (2023) 100006.

[11] B. R. Chakravarthi, M. B. Jagadeeshan, V. Palanikumar, R. Priyadharshini, Offensive language identification in dravidian languages using mpnet and cnn, International Journal of Information Management Data Insights 3 (2023) 100151.

[12] B. R. Chakravarthi, A. Hande, R. Ponnusamy, P. K. Kumaresan, R. Priyadharshini, How can we detect homophobia and transphobia? experiments in a multilingual code-mixed setting for social media governance, International Journal of Information Management Data Insights 2 (2022) 100119.

[13] S. Divya, N. Sripriya, D. Evangelin, G. Saai Sindhoora, Opinion classification on code-mixed tamil language, in: International Conference on Speech and Language Technologies for Low-resource Languages, Springer, 2022, pp. 155–168.

[14] S. Khanuja, S. Dandapat, A. Srinivasan, S. Sitaram, M. Choudhury, GLUECoS: An evaluation benchmark for code-switched NLP, in: D. Jurafsky, J. Chai, N. Schluter, J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 3575–3585. URL: https://aclanthology.org/2020.acl-main.329. doi:10.18653/v1/2020.acl-main.329.

[15] B. R. Chakravarthi, R. Priyadharshini, V. Muralidaran, N. Jose, S. Suryawanshi, E. Sherly, J. P. McCrae, Dravidiancodemix: Sentiment analysis and offensive language identification dataset for dravidian languages in code-mixed text, Language Resources and Evaluation 56 (2022) 765–806.

[16] A. Hande, R. Priyadharshini, B. R. Chakravarthi, Kancmd: Kannada codemixed dataset for sentiment analysis and offensive language detection, in: Proceedings of the Third Workshop on Computational Modeling of People's Opinions, Personality, and Emotion's in Social Media, 2020, pp. 54–63.

[17] B. R. Chakravarthi, S. N, B. B, N. K, T. Durairaj, R. Ponnusamy, P. K. Kumaresan, K. K. Ponnusamy, C. Rajkumar, Overview of sarcasm identification of dravidian languages in dravidiancodemix@fire-2023 (2023).

[18] S. N, B. B, T. Durairaj, N. K, R. Ponnusamy, P. K. Kumaresan, K. K. Ponnusamy, C. Rajkumar, Overview of sarcasm identification of dravidian languages in dravidiancodemix@fire-2024, ????

[19] N. Narayan, S. Mohanty, Enhancing sarcasm detection in code-mixed dravidian texts using data augmentation and transformer models, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[38] A. Chowdhury, S. Paul, S. Kundu, A. K. Thakur, A. Sarkar, A. R. Chaudhuri, A. Ray, D. Mitra, D. Saha, Sarcasm detection in dravidian languages, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[39] P. Maity, D. Saha, S. Das, S. K. Mahata, D. Das, A hybrid approach to sarcasm detection in dravidian code-mixed texts, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[40] A. Barman, A. Mandal, S. K. Naskar, Sarcasm or serious? sarcasm detection in code-mixed dravidian languages, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[41] M. Subramanian, A. S, D. P, D. S, K. S V, Investigation of machine learning and transformer models for sarcasm detection in dravidian languages, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[20] Shanmugavadivel, S. K, S. Janani J S, R. K, Leveraging transfer learning and deep recurrent [...] Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[21] Shanmugavadivel, P. C, V. L, S. S, Leveraging machine learning and bert for sarcasm detection in text, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[22] S. D, K. G, H. L. Shashirekha, Unmasking sarcasm: Exploring mbert+cnn and lstm models for [...] and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[23] Cerda-Flores, Castro-Pineda, M. G. Juarez, R. I. Hernandez-Mazariegos, Cerda-Jacobo, [...] 2024, DAIICT, Gandhinagar, 2024.

[24] Chanda, Tewari, Mukherjee, Pal, Leveraging chatgpt and xlm-roberta for sarcasm [...] FIRE - 2024, DAIICT, Gandhinagar, 2024.

[25] Chauhan, Kumar, A transformer-based model for detecting multilingual sarcasm in social media posts, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[26] S. S, J. S, K. J. P, Sarcasm unveiled: Advanced detection techniques for tamil and malayalam using multi modal approaches, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[27] S. B K, S. Priyaa G K, A. P, C. Mahibha, Sarcasm detection in dravidian languages using bi-directional lstm, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[28] Subramanian, A. A, A. T, A. M, K. S, Sarcasm detection in dravidian languages using machine learning and transformer models, in: Forum of Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[29] Shanmugavadivel, Subramanian, S. R, M. Sameer B, M. K, Bi-lstm and hybrid model based [...] Information Retrieval and Evaluation FIRE - 2024, DAIICT, Gandhinagar, 2024.

[30] M. A, K. S, P. Priya B, B. B, Sarcasm detection and identification of dravidian language using machine learning approach, in: Forum of Information Retrieval and Evaluation FIRE -

2024 ,

DAIICT , Gandhinagar, 2024 . [31]

Shanmugavadivel , N. K, S. S, Enhanced sarcasm detection in code-mixed tamil-english text

Evaluation

FIRE

- 2024 , DAIICT , Gandhinagar, 2024 . [32]

Mahibha ,

Shimi ,

Thenmozhi , Sarcasm detection from dravidian language text , in: Forum

of Information Retrieval and Evaluation FIRE -

2024 , DAIICT , Gandhinagar, 2024 . [33]

Kumar ,

Kumar , Msd: Multilingual sarcasm detection using deep learning-based model , in:

Forum of Information Retrieval and Evaluation FIRE -

2024 , DAIICT , Gandhinagar, 2024 . [34]

Karthik ,

Sreekumar ,

Shyam Potta ,

Thenmozhi , Sarcasm identification of dravidian

languages malayalam and tamil, in: Forum of Information Retrieval and Evaluation FIRE -

2024 ,

DAIICT , Gandhinagar, 2024 . [35]

Shanmugavadivel ,

Murugan

V , P. Sree

M , P. Chinnappan

D , Automated sarcasm identification

Retrieval and Evaluation FIRE -

2024 , DAIICT , Gandhinagar, 2024 . [36] K. S V , M. Subramnanian , P. S P , V. S H , A. M,

Detecting sarcasm in social media text using

Evaluation

FIRE

- 2024 , DAIICT , Gandhinagar, 2024 . [37]

Deroy ,

Maity , Youtube comments decoded: Leveraging llms for low resource language

classification, in: Forum of Information Retrieval and Evaluation FIRE -

2024 , DAIICT , Gandhinagar,