=Paper=
{{Paper
|id=Vol-3681/T5-1
|storemode=property
|title=Overview of Sarcasm Identification of Dravidian Languages in DravidianCodeMix@FIRE-2023
|pdfUrl=https://ceur-ws.org/Vol-3681/T5-1.pdf
|volume=Vol-3681
|authors=Bharathi Raja Chakravarthi,Sripriya N,Bharathi B,Nandhini K,Subalalitha Chinnaudayar Navaneethakrishnan,Thenmozhi Durairaj,Rahul Ponnusamy,Prasanna Kumar Kumaresan,Kishore Kumar Ponnusamy,Charmathi Rajkumar
|dblpUrl=https://dblp.org/rec/conf/fire/ChakravarthiNBK23
}}
==Overview of Sarcasm Identification of Dravidian Languages in DravidianCodeMix@FIRE-2023==
Bharathi Raja Chakravarthi<sup>1,∗,†</sup>, Sripriya N<sup>2,†</sup>, Bharathi B<sup>2,†</sup>, Nandhini K<sup>3,†</sup>, Subalalitha Chinnaudayar Navaneethakrishnan<sup>4,†</sup>, Thenmozhi Durairaj<sup>2,†</sup>, Rahul Ponnusamy<sup>1,†</sup>, Prasanna Kumar Kumaresan<sup>1,†</sup>, Kishore Kumar Ponnusamy<sup>5,†</sup> and Charmathi Rajkumar<sup>6,†</sup>

<sup>1</sup> University of Galway, Galway, Ireland; <sup>2</sup> Sri Sivasubramaniya Nadar College of Engineering, Tamil Nadu, India; <sup>3</sup> Central University of Tamil Nadu, Tamil Nadu, India; <sup>4</sup> Department of Computer Science Engineering, SRM Institute of Science and Technology, Tamil Nadu, India; <sup>5</sup> Digital University of Kerala, Kerala, India; <sup>6</sup> The American College, Tamil Nadu, India

===Abstract===
Sarcasm identification in code-mixed languages is a crucial task in natural language processing, given the prevalence of multilingual and multicultural communication on social media platforms. This overview paper examines the sarcasm identification shared task held as part of DravidianCodeMix@FIRE-2023. The primary objective of the task was to identify instances of sarcasm within a dataset of code-mixed comments in Tamil-English and Malayalam-English, sourced from social media platforms. A total of 11 teams participated in this shared task, which focused on two Dravidian languages: Tamil and Malayalam. The central aim was to predict whether a given comment was sarcastic or non-sarcastic. The analysis covers the models used by the participating teams and the specific challenges encountered when detecting sarcasm in code-mixed text. The submitted systems were evaluated in terms of macro-F1 score. The paper also examines all the submissions made during this task.
Keywords: Sarcasm Detection, Dataset Creation, Classification Task, Code-Mixing, Dravidian Languages

Forum for Information Retrieval Evaluation, December 15-18, 2023, India

∗ Corresponding author.

Email: bharathi.raja@insight-centre.org (B. R. Chakravarthi); sripriyan@ssn.edu.in (S. N); bharathib@ssn.edu.in (B. B); nandhinikumaresh@cutn.ac.in (N. K); subalaln@srmist.edu.in (S. C. Navaneethakrishnan); theni_d@ssn.edu.in (T. Durairaj); r.ponnusamy1@universityofgalway.ie (R. Ponnusamy); P.Kumaresan1@universityofgalway.ie (P. K. Kumaresan); kishorep161002@gmail.com (K. K. Ponnusamy); charmathirajkumar@gmail.com (C. Rajkumar)

ORCID: 0000-0002-0877-7063 (B. R. Chakravarthi); 0000-0003-2070-418X (S. N); 0000-0001-7279-5357 (B. B); 0000-0003-4778-6525 (N. K); 0000-0002-8920-707X (S. C. Navaneethakrishnan); 0000-0003-0681-6628 (T. Durairaj); 0000-0001-8023-7742 (R. Ponnusamy); 0000-0003-2244-246X (P. K. Kumaresan); 0000-0001-9621-668X (K. K. Ponnusamy); 0000-0002-2531-1070 (C. Rajkumar)

© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

===1. Introduction===
Sarcasm is a form of verbal expression characterized by the use of language that typically conveys a meaning contrary to the words spoken, posing a significant challenge for machines in deciphering its true intent [1]. It is primarily discernible through the tone of voice, marked by a subtle undertone of irony, and heavily reliant on context, rendering it a complex subject for computational analysis. Furthermore, sarcasm often employs positive vocabulary to convey negative sentiments, further adding to the perplexity faced by sentiment analysis models [2][3][4].
Sarcasm detection has gained considerable attention in natural language processing, and recent research is moving toward multimodal sarcasm detection [5][6]. Sarcasm is regarded as one of the hardest problems for sentiment analysis systems. It communicates an opinion indirectly, with the intended meaning diverging from the literal one [7]. Identifying sentiment, offensive content, or sarcasm in social media text remains a persistent challenge for Dravidian languages [8][9][10]. There is an increasing demand for sarcasm and sentiment detection on social media texts, which are largely code-mixed for Dravidian languages [11][12]. Code-mixing is a prevalent phenomenon in multilingual communities, and code-mixed texts are sometimes written in non-native scripts. Systems trained on monolingual data fail on code-mixed data because of the complexity of code-switching at different linguistic levels in the text. This shared task presents a new gold standard corpus for sarcasm and sentiment detection of code-mixed text in Dravidian languages (Tamil-English and Malayalam-English).

The Tamil language is spoken by Tamil people in India and Sri Lanka and by the Tamil diaspora around the world, with official recognition in India, Sri Lanka, and Singapore. Malayalam is a Dravidian language spoken predominantly by the people of Kerala, India. The Tamil script evolved from the Tamili script, Vatteluttu alphabet, and Chola-Pallava script [13]. It has 12 vowels, 18 consonants, and 1 āytam (voiceless velar fricative). Minority languages such as Saurashtra, Badaga, Irula, and Paniya are also written in the Tamil script. However, social media users often type in the Roman script because it is easier to input. Hence, the majority of the data available on social media for these under-resourced languages is code-mixed [14].
The goal of this task is to identify sarcasm and sentiment polarity in a code-mixed dataset of comments/posts in Tamil-English and Malayalam-English collected from social media. A comment/post may contain more than one sentence, but the average number of sentences per comment/post in the corpora is 1. Each comment/post is annotated with sentiment polarity at the comment/post level. The dataset is also class-imbalanced, reflecting real-world scenarios. Our proposal aims to encourage research that will reveal how sarcasm is expressed in code-mixed scenarios on social media. The participants were provided with development, training, and test datasets. This is a comment-level polarity classification task: given a YouTube comment, systems have to classify it as sarcastic or not sarcastic. As far as we know, this is the first shared task on sarcasm detection in Dravidian code-mixed text. This work discusses the various models submitted to the shared task and the results of the participating teams.

The rest of the article is organized as follows: Section 2 describes the shared task. Section 3 discusses the dataset. Section 4 summarizes the systems and methodologies used by each participating team for both shared sub-tasks and highlights the features of each model. The analysis of the results and findings of the methodologies submitted by the participants is presented in Section 5. Concluding remarks are presented in Section 6.

===2. Task Description===
This shared task focuses on sarcasm detection in code-mixed text in Dravidian languages. The task covers two language pairs: Tamil-English and Malayalam-English. The comments used in this task were collected from social media sources. The goal of the shared task is to predict whether a given comment is sarcastic or non-sarcastic. Participants were given access to development, training, and test datasets. This is the first shared task on sarcasm detection in Dravidian code-mixed text.
More information about this task is available on the CodaLab site (https://codalab.lisn.upsaclay.fr/competitions/13540).

===3. Dataset===
The Tamil-English and Malayalam-English dataset contains social media comments with all three types of code-mixed sentences: inter-sentential switch, intra-sentential switch, and tag switching. Most comments were written in native script and the Roman script with either Tamil/Malayalam grammar with English lexicon or English grammar with Tamil/Malayalam lexicon. Some comments were written in Tamil/Malayalam script with English expressions in between. The dataset is divided into training and validation sets for both Tamil and Malayalam. Additionally, test sets are provided both with and without labels for these languages. Table 1 gives the distribution of the train, validation and test sets. It was observed that there are more non-sarcastic comments than sarcastic ones in the dataset, which is shown in Table 2. This makes the datasets imbalanced and skewed more towards one class than the other, which the participants had to consider when developing their classification systems.

{| class="wikitable"
|+ Table 1: Data Distribution
! Language !! Train !! Dev !! Test !! Total
|-
| Tamil-English || 27,036 || 6,759 || 8,449 || 42,244
|-
| Malayalam-English || 12,057 || 3,015 || 3,768 || 18,840
|}

{| class="wikitable"
|+ Table 2: Class Distribution
! Language !! Class !! Train !! Dev !! Test
|-
| Tamil-English || Sarcastic || 7,170 || 1,820 || 2,263
|-
| Tamil-English || Non-Sarcastic || 19,866 || 4,939 || 6,186
|-
| Malayalam-English || Sarcastic || 2,259 || 588 || 685
|-
| Malayalam-English || Non-Sarcastic || 9,798 || 2,427 || 3,083
|}

===4. Methodology===
In this shared task, two languages, namely Tamil-English and Malayalam-English, are involved, with 11 participating teams. The participants have employed a variety of methods to distinguish between sarcastic and non-sarcastic text. The methods included the utilization of several models, such as BERT [15], DistilBERT [16], XLM-RoBERTa [17], SVM, Text Augmentation (TA), Multilingual BERT Model [18], IndicBERT [19], Linear SVC, KNN, ALBERT transformer model [20], MLP Classifier, MURIL [21], BiLSTM, and 2-way-20-shot.
These models were applied to detect sarcasm in the text data. To enhance the performance of the models for this specific task, the participants implemented various techniques, including instruction tuning and alignment training.

Team “SSN-FeaturesAlpha” [22] submitted systems for both Tamil-English and Malayalam-English. They used BERT, DistilBERT, XLM-RoBERTa, SVM, and TF-IDF models, and achieved maximum F1 scores of 0.68 for Tamil and 0.63 for Malayalam with different models.

Team “MUCS” [23] submitted systems for both Tamil-English and Malayalam-English. They used Random Forest Classifier (RF) and Support Vector Classifier (SVC) with hard voting, a Deep Learning (DL) model (Convolutional Neural Network, CNN), and Transfer Learning (TL) based models: Multilingual Bidirectional Encoder Representations from Transformers (mBERT) and the distilled version of Multilingual BERT (mDistilBERT) for Malayalam and Tamil code-mixed texts, respectively. With the Text Augmentation (TA) technique, they achieved a maximum F1 score of 0.70 for Tamil and 0.71 for Malayalam.

Team “IRLabIITBHU” [24] submitted systems for both Tamil-English and Malayalam-English. They used a pre-trained Multilingual BERT model and achieved a maximum F1 score of 0.72 for both Tamil and Malayalam.

Team “TechWhiz” [25] submitted systems for both Tamil-English and Malayalam-English. They used transformer models (IndicBERT, mBERT, DistilBERT). The IndicBERT model outperformed the other models and achieved a maximum F1 score of 0.66 for Tamil and 0.63 for Malayalam.

Team “ABC” [26] submitted systems for both Tamil-English and Malayalam-English. They used TFIDFVectorizer to convert text data into numerical form and a stacking classifier combining LinearSVC, RandomForest, and KNN as base models, with logistic regression as the meta classifier. The weighted average F1 score was 0.73 for Tamil and 0.72 for Malayalam.
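The stacking design described for Team “ABC” can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the team's code: the hyperparameters (`cv=3`, `n_estimators=50`, `n_neighbors=3`), the function name `build_stacking_pipeline`, the label strings, and the toy comments are all assumptions made for the example, and the toy comments are invented rather than taken from the shared-task corpus.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def build_stacking_pipeline(cv=3, seed=0):
    """TF-IDF features feeding a stacked ensemble (all hyperparameters are assumptions)."""
    base_models = [
        ("linear_svc", LinearSVC(random_state=seed)),
        ("random_forest", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("knn", KNeighborsClassifier(n_neighbors=3)),
    ]
    # Out-of-fold predictions of the base models become the inputs
    # of the logistic-regression meta classifier.
    stack = StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(),
        cv=cv,
    )
    return make_pipeline(TfidfVectorizer(), stack)


# Invented placeholder comments, NOT drawn from the shared-task corpus.
TOY_TEXTS = [
    "oh great another remake just what we needed",
    "wow such original dialogue never heard before",
    "sure the hero can fly now totally realistic",
    "amazing only three hours of pure boredom",
    "nice trailer it spoiled the whole movie",
    "brilliant idea releasing it on exam day",
    "the songs in this movie are really good",
    "waiting for the release next week",
    "the lead actor performed very well",
    "i liked the background music a lot",
    "best trailer of the year so far",
    "congratulations to the whole team",
]
TOY_LABELS = ["Sarcastic"] * 6 + ["Non-sarcastic"] * 6

model = build_stacking_pipeline().fit(TOY_TEXTS, TOY_LABELS)
predictions = model.predict(["wow what a surprise it failed again"])
```

With so little data the prediction itself is meaningless; the point is the wiring. TF-IDF turns each comment into a sparse vector, and the three base models are trained in cross-validation folds so that the meta classifier learns from predictions on comments the base models did not see.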
Team “ENDEAVOUR” [27] submitted systems for both Tamil-English and Malayalam-English. They performed experiments using transfer learning models and observed that the multilingual BERT model gave the best result. The F1 score was 0.70 for Tamil and 0.53 for Malayalam.

Team “Ramyasiva” [28] submitted systems for both Tamil-English and Malayalam-English. They used the ALBERT transformer model. The weighted average F1 score was 0.71 for Tamil and 0.52 for Malayalam.

Team “SSNCSE” [29] submitted systems for both Tamil-English and Malayalam-English. They used the Count Vectorizer with MLP Classifier and Logistic Regression, the TF-IDF Vectorizer with MLP Classifier, and the TF-IDF Vectorizer with Random Forest Classifier. The F1 score was 0.73 for Tamil and 0.74 for Malayalam.

Team “hatealert” [30] submitted systems for both Tamil-English and Malayalam-English. They used the MURIL and m-BERT models; the better-performing model, MURIL, achieved an F1 score of 0.74 for Tamil and 0.73 for Malayalam.

Team “YenCS” [31] submitted systems for both Tamil-English and Malayalam-English. They used a Bidirectional Long Short-Term Memory (BiLSTM) model and achieved weighted F1-scores of 0.68 for Tamil and 0.63 for Malayalam.

Team “Hydrangea” [32] submitted systems for both Tamil-English and Malayalam-English. They implemented three models to detect sarcasm: BERT, XLM-RoBERTa, and 2-way-20-shot learning. The 2-way-20-shot approach performed best, achieving weighted F1-scores of 0.69 for Tamil and 0.57 for Malayalam.

===5. Results and Discussion===
Overall, 11 teams participated in the shared task. The performance of all models was assessed using the macro F1-score. The F1 score, the harmonic mean of precision and recall, is particularly well suited to evaluating sarcasm detection models because it balances false positives against false negatives.
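To make the metric concrete, the sketch below computes per-class, macro-averaged, and support-weighted F1 in plain Python and applies them to a majority-class baseline built from the Tamil-English test counts in Table 2 (2,263 sarcastic and 6,186 non-sarcastic comments). The function names and label strings are hypothetical, and the baseline is an illustration rather than any participant's system.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class: harmonic mean of that class's precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    labels = sorted(set(y_true))
    return sum(per_class_f1(y_true, y_pred, label) for label in labels) / len(labels)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true instances per class."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        per_class_f1(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


# Tamil-English test split from Table 2; label strings are hypothetical.
y_true = ["Sarcastic"] * 2263 + ["Non-sarcastic"] * 6186
# Baseline that always predicts the majority class.
y_pred = ["Non-sarcastic"] * len(y_true)

baseline_macro = macro_f1(y_true, y_pred)
baseline_weighted = weighted_f1(y_true, y_pred)
```

On this baseline the macro average comes out at roughly 0.42 while the weighted average is roughly 0.62, because the weighted variant is dominated by the majority class. This is why the averaging scheme matters on an imbalanced dataset, and why scores reported as "weighted" and "macro" F1 are not directly comparable.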
The results of all the systems underscore the intricacies involved in the task of sarcasm identification across different languages and its substantial relevance within sentiment analysis. Despite the substantial challenges posed by language diversity, code-mixing, and disparities in social class, all the teams developed systems that exhibited promising potential. Lately, transformer-based language models have showcased impressive abilities, leveraging their powerful embedding representations and self-attention mechanisms, thus pushing the boundaries of language comprehension.

Team “hatealert” took first place in the Tamil-English task, and Team “SSNCSE” took first place in the Malayalam-English task. In the Tamil-English task, the top-performing system was submitted by team “hatealert”, who used MURIL to achieve a macro-F1 score of 0.74. They utilized transformer-based models, namely m-BERT and MURIL. Their results illustrated that MURIL surpasses m-BERT in various metrics in both languages. The enhanced performance of MURIL can be credited to its dedicated pre-training on Indian languages and their transliterations. In the Malayalam-English task, the top-performing system was submitted by team “SSNCSE”, who used a Count Vectorizer with an MLP Classifier and Logistic Regression to achieve a macro-F1 score of 0.74. The rank lists obtained by the participants for Tamil and Malayalam are presented in Tables 3 and 4.

===6. Conclusion===
This overview summarized the shared task on sarcasm identification in Dravidian languages.
The shared task aimed to identify sarcasm and sentiment polarity in a code-mixed dataset of comments/posts in Tamil-English and Malayalam-English collected from social media. The dataset had class imbalance problems reflecting real-world scenarios. This research emphasizes the importance of addressing linguistic variations to gain accurate insights from online content. It also sheds light on the intricate multilingual aspects of sarcasm detection and lays the groundwork for future advancements in sentiment analysis, such as implementing restrictions on posting sarcastic comments under online videos. The similar patterns observed in the trials conducted in Tamil and Malayalam underscore the challenges presented by code-mixing and linguistic characteristics in both languages, and point to the need for multilingual approaches that address the same task in multiple languages. The participants developed various models based on machine learning, deep learning, and natural language processing. The submissions were ranked by the F1 score of the models.

{| class="wikitable"
|+ Table 3: Rank list for Task: Tamil-English
! Team Name !! F1-score !! Rank
|-
| hatealert [30] || 0.74 || 1
|-
| ABC [26] || 0.73 || 2
|-
| SSNCSE [29] || 0.73 || 2
|-
| IRLabIITBHU [24] || 0.72 || 3
|-
| ramyasiva [28] || 0.71 || 4
|-
| ENDEAVOUR [27] || 0.70 || 5
|-
| MUCS [23] || 0.70 || 5
|-
| Hydrangea [32] || 0.69 || 6
|-
| SSN-FeaturesAlpha [22] || 0.68 || 7
|-
| YenCS [31] || 0.68 || 7
|-
| TechWhiz [25] || 0.66 || 8
|}

{| class="wikitable"
|+ Table 4: Rank list for Task: Malayalam-English
! Team Name !! F1-score !! Rank
|-
| SSNCSE [29] || 0.74 || 1
|-
| hatealert [30] || 0.73 || 2
|-
| ABC [26] || 0.72 || 3
|-
| IRLabIITBHU [24] || 0.72 || 3
|-
| MUCS [23] || 0.71 || 4
|-
| SSN-FeaturesAlpha [22] || 0.63 || 5
|-
| TechWhiz [25] || 0.63 || 5
|-
| YenCS [31] || 0.63 || 5
|-
| Hydrangea [32] || 0.57 || 6
|-
| ENDEAVOUR [27] || 0.53 || 7
|-
| ramyasiva [28] || 0.52 || 8
|}

===Acknowledgments===

===References===
[1] P. Bhattacharyya, A. Joshi, Computational sarcasm, in: A. Birch, N. Schneider (Eds.), Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, Association for Computational Linguistics, Copenhagen, Denmark, 2017.
URL: https://aclanthology.org/D17-3002.

[2] D. Šandor, M. Bagic Babac, Sarcasm detection in online comments using machine learning, Information Discovery and Delivery (2023). doi:10.1108/IDD-01-2023-0002.

[3] D. Ghosh, A. R. Fabbri, S. Muresan, Sarcasm analysis using conversation context, Computational Linguistics 44 (2018) 755–792. URL: https://aclanthology.org/J18-4009. doi:10.1162/coli_a_00336.

[4] A. Joshi, P. Bhattacharyya, M. J. Carman, Automatic sarcasm detection: A survey, ACM Computing Surveys 50 (2017) 73:1–73:22. URL: https://dl.acm.org/citation.cfm?id=3145473.3124420.

[5] T. Yue, R. Mao, H. Wang, Z. Hu, E. Cambria, Knowlenet: Knowledge fusion network for multimodal sarcasm detection, Information Fusion 100 (2023) 101921.

[6] Y. Qiao, L. Jing, X. Song, X. Chen, L. Zhu, L. Nie, Mutual-enhanced incongruity learning network for multi-modal sarcasm detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, 2023, pp. 9507–9515.

[7] P. Liu, W. Chen, G. Ou, T. Wang, D. Yang, K. Lei, Sarcasm detection in social media based on imbalanced classification, in: Web-Age Information Management: 15th International Conference, WAIM 2014, Macau, China, June 16-18, 2014. Proceedings 15, Springer, 2014, pp. 459–471.

[8] B. R. Chakravarthi, Hope speech detection in YouTube comments, Social Network Analysis and Mining 12 (2022) 75.

[9] B. R. Chakravarthi, R. Priyadharshini, S. Banerjee, M. B. Jagadeeshan, P. K. Kumaresan, R. Ponnusamy, S. Benhur, J. P. McCrae, Detecting abusive comments at a fine-grained level in a low-resource language, Natural Language Processing Journal 3 (2023) 100006.

[10] B. R. Chakravarthi, M. B. Jagadeeshan, V. Palanikumar, R. Priyadharshini, Offensive language identification in Dravidian languages using MPNet and CNN, International Journal of Information Management Data Insights 3 (2023) 100151.

[11] B. R. Chakravarthi, A. Hande, R. Ponnusamy, P. K. Kumaresan, R.
Priyadharshini, How can we detect homophobia and transphobia? Experiments in a multilingual code-mixed setting for social media governance, International Journal of Information Management Data Insights 2 (2022) 100119.

[12] S. Divya, N. Sripriya, D. Evangelin, G. Saai Sindhoora, Opinion classification on code-mixed Tamil language, in: International Conference on Speech and Language Technologies for Low-resource Languages, Springer, 2022, pp. 155–168.

[13] B. R. Chakravarthi, R. Priyadharshini, V. Muralidaran, N. Jose, S. Suryawanshi, E. Sherly, J. P. McCrae, DravidianCodeMix: Sentiment analysis and offensive language identification dataset for Dravidian languages in code-mixed text, Language Resources and Evaluation 56 (2022) 765–806.

[14] A. Hande, R. Priyadharshini, B. R. Chakravarthi, KanCMD: Kannada codemixed dataset for sentiment analysis and offensive language detection, in: Proceedings of the Third Workshop on Computational Modeling of People’s Opinions, Personality, and Emotion’s in Social Media, 2020, pp. 54–63.

[15] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).

[16] V. Sanh, L. Debut, J. Chaumond, T. Wolf, DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, ArXiv abs/1910.01108 (2019).

[17] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, CoRR abs/1911.02116 (2019). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.

[18] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.

[19] D. Kakwani, A. Kunchukuttan, S. Golla, G. N.C., A. Bhattacharyya, M. M. Khapra, P.
Kumar, IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages, in: Findings of EMNLP, 2020.

[20] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, R. Soricut, ALBERT: A lite BERT for self-supervised learning of language representations, CoRR abs/1909.11942 (2019). URL: http://arxiv.org/abs/1909.11942. arXiv:1909.11942.

[21] S. Khanuja, D. Bansal, S. Mehtani, S. Khosla, A. Dey, B. Gopalan, D. K. Margam, P. Aggarwal, R. T. Nagipogu, S. Dave, S. Gupta, S. C. B. Gali, V. Subramanian, P. Talukdar, MuRIL: Multilingual representations for Indian languages, 2021. arXiv:2103.10730.

[22] V. Indirakanth, D. Udayakumar, T. Durairaj, B. B, Sarcasm identification of Dravidian languages (Malayalam and Tamil), in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[23] N. N, V. V, A. Hegde, H. L. Shashirekha, Learning models with text augmentation for sarcasm detection in Malayalam and Tamil code-mixed texts, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[24] S. Chanda, A. Mishra, S. Pal, Sarcasm detection in Tamil and Malayalam Dravidian code-mixed text, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[25] M. M, K. Akshatra. M, T. J, C. Mahibha, T. Durairaj, Sarcasm detection in Dravidian languages using transformer models, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[26] P. Shetty, Sarcasm identification in Dravidian languages Tamil and Malayalam, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[27] P. Ram N, M. T, K. V, M. S, M. B, Sarcasm identification in codemix Dravidian languages, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[28] R. Sivakumar, C. Mahibha, B. Jenefer, Identifying the type of sarcasm in Dravidian languages using deep-learning models, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[29] D. Krishnan, K. Dharanikota, B.
Bharathi, Cross-linguistic sarcasm detection in Tamil and Malayalam: A multilingual approach, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[30] A. B. Bhaumik, M. Das, Sarcasm detection in Dravidian code-mixed text using transformer-based models, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[31] A. M. D, P. R. Hegde, Unmasking sarcasm: Sarcastic language detection with BiLSTMs, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.

[32] S. Thirumoorthy, M. N. R, T. Durairaj, R. Ratnavel, A few shot learning to detect sarcasm in Tamil and Malayalam code mixed data, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.