<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>DravidianCodeMix 2025: Offensive Content Classification in Kannada-Tulu Code-Mixed Texts Using Classical Machine Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Santhiya P</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anand S</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Archna V</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Debehaa J</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kongu Engineering College</institution>
          ,
          <addr-line>Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>1</volume>
      <fpage>7</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>Detecting abusive and offensive content on social media remains a growing challenge, particularly for low-resource languages such as Kannada and Tulu. The complexity increases due to factors such as code-mixing, informal writing, and the limited availability of annotated data. In this study, we evaluate machine learning models including Linear Support Vector Machines (SVM), Logistic Regression (LR), and Multinomial Naive Bayes (NB), along with an Ensemble Voting Classifier, for offensive content detection in Kannada and Tulu YouTube comments. The text data was preprocessed by removing noise such as URLs and non-language symbols, followed by TF-IDF-based feature extraction. Performance was assessed using accuracy on the development datasets. For Tulu, the Ensemble model achieved the best accuracy of 0.79, while for Kannada, Logistic Regression provided the highest accuracy of 0.68. These results highlight the potential of ensemble approaches for handling low-resource and code-mixed data. Future work will explore deep learning architectures such as transformer-based models (e.g., BERT, IndicBERT) and data augmentation strategies to further enhance detection accuracy.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The rise of social media has revolutionized how individuals communicate, share opinions, and engage
with global communities. However, this unprecedented connectivity comes at the cost of an alarming
increase in abusive language, particularly targeting women. Abusive language not only perpetuates
gender inequality but also has severe psychological and social consequences. The anonymity offered by
online platforms further emboldens individuals to engage in such harmful behavior. Hence, developing
automated systems to monitor and flag abusive language has become a pressing necessity. Manual
moderation alone is neither scalable nor efficient given the vast volume of user-generated content
produced every second. Advanced computational techniques are therefore essential to ensure safe
digital spaces for vulnerable groups. Addressing this issue requires efficient tools to detect and mitigate
such content effectively.</p>
      <p>
        Previous work on abusive text detection has predominantly focused on English, leaving
low-resource languages like Kannada and Tulu underexplored. Moreover, the code-mixed nature of these
languages further complicates the task, as traditional monolingual models fail to handle the linguistic
complexities inherent in such data. In multilingual societies, code-switching is not only common but
also adds layers of ambiguity, making offensive content harder to identify. Furthermore, the scarcity
of annotated datasets in Kannada and Tulu presents an additional barrier to building robust systems.
Building on the growing body of research on offensive language detection [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], this study proposes the
application of machine learning models for classifying Kannada and Tulu social media comments as
abusive or non-abusive.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature survey</title>
      <p>The rapid growth of social media platforms has transformed global communication but has also fostered
the spread of hate speech and abusive or offensive content targeting individuals and communities. This has
raised the demand for automated methods of offensive language detection, particularly for low-resource
and code-mixed languages such as Tamil, Malayalam, and Kannada.</p>
      <sec id="sec-2-1">
        <title>2.1. Early Approaches with Classical Machine Learning</title>
        <p>
          Early research on offensive language detection mostly relied on classical machine learning methods
such as Support Vector Machines (SVM), Naïve Bayes, and Decision Trees. These models often used
surface-level features such as n-grams and TF–IDF. Hasan et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] proposed the OLF-ML framework,
which integrated preprocessing with traditional classifiers for detecting and categorizing offensive
language. However, these models struggled with noisy user-generated data, spelling variations, and
code-mixing.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Deep Learning Approaches</title>
        <p>
          Deep learning models have shown stronger performance by learning contextual representations.
Agrawal and Awekar demonstrated the use of deep learning for cyberbullying detection across platforms.
Gao and Huang used context-aware models for hate speech detection, highlighting the importance
of semantic understanding. Manukonda [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] applied BiLSTM with subword tokenization to detect
homophobia and transphobia, showing that subword-level representations are effective in handling
spelling variation in low-resource languages.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Transformer-Based Approaches</title>
        <p>
          Transformer-based models have become the standard for offensive content detection. Kakati and
Dandotiya employed MuRIL and DConvBLSTM ensembles for hate speech detection in Indian languages,
achieving improved performance in code-mixed scenarios. Nalini et al. provided a comprehensive review,
noting that transformers such as IndicBERT and XLM-R consistently outperform traditional approaches.
Rahman et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] demonstrated the effectiveness of transformer models in detecting abusive Tamil
text. Chakravarthi et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] introduced a multilingual MPNet+CNN fusion model that achieved high
F1-scores for Tamil, Malayalam, and Kannada.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Offensive Detection in Dravidian Languages</title>
        <p>
          Several shared tasks have focused on offensive language in Dravidian languages. Chakravarthi et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]
          ]
reported results for Tamil, Malayalam, and Kannada offensive language detection, providing benchmark datasets.
The DravidianCodeMix dataset [7] further extended sentiment and offensive classification resources.
Ranasinghe and Zampieri evaluated multilingual approaches for Indian languages, showing the benefits
of transfer learning. More recently, Chakravarthi et al. provided an overview of the FIRE-2025 shared
task on offensive language identification in Dravidian code-mixed languages, establishing a common
evaluation benchmark.
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>2.5. Surveys and Reviews</title>
        <p>Multiple surveys have highlighted the challenges in abusive language detection. Schmidt and Wiegand
presented an early survey of hate speech detection using NLP techniques. Nandi et al. [8] surveyed
hate speech detection in Indian languages, noting difficulties in under-resourced settings. Salminen
analyzed how the interpretation of online hate varies across cultures and individuals. Nalini et al. reviewed
advancements in offensive language detection, including transformer-based methods, and provided
detailed experimental comparisons.</p>
      </sec>
      <sec id="sec-2-6">
        <title>2.6. Recent Advances in Dravidian Shared Tasks</title>
        <p>Recent workshops have introduced diverse offensive and hate speech detection tasks in Dravidian
languages. Sharma et al. [9] explored hope speech detection using ensemble models. Sathvik and Sonani
studied religious hate speech detection in Karnataka. Sreeja and Bharathi investigated multimodal
hate speech detection, showing the importance of combining text with other modalities. Santhiya et al.
benchmarked classical machine learning models for Kannada–Tulu offensive classification, providing
one of the first studies of this under-resourced language pair. Anusha et al. [10] recently focused on
overcoming low-resource barriers in Tulu through neural methods and corpus creation.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Task Description</title>
        <p>The aim of this study is to automatically classify social media comments written in Kannada and Tulu
into different types of offensive and non-offensive content. The comments were collected from YouTube
and manually annotated. Each comment belongs to one of the following six categories:
• Offensive
• Non-Offensive
• Offensive Untargeted
• Offensive Targeted Insult Group
• Offensive Targeted Insult Individual
• Offensive Targeted Insult Other</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dataset</title>
        <p>We used two separate datasets: one for Kannada and another for Tulu, both containing code-mixed
social media comments. Each dataset was divided into training, development, and test sets to support
model training and evaluation.</p>
        <p>• Kannada dataset: Training set: 6,218 comments; Development set: 778 comments; Test set: 778 comments</p>
        <p>• Tulu dataset: Training set: 2,693 comments; Development set: 578 comments; Test set: 577 comments</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Preprocessing and Feature Extraction</title>
        <p>Preprocessing was crucial for preparing the text data for accurate classification and consisted of:</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Text Cleaning</title>
          <p>All comments were cleaned to remove noise such as URLs, punctuation, numbers, special
characters, and extra spaces. Both Kannada and English letters were preserved to retain code-mixed
content. Empty or blank comments were removed from the datasets to avoid errors during training [9].</p>
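          <p>As a minimal sketch, and assuming the comments are written in Kannada script (U+0C80 to U+0CFF) or Latin transliteration, the cleaning step can be expressed with regular expressions; the sample comments below are purely illustrative:</p>
          <preformat>
```python
import re

def clean_comment(text):
    """Strip URLs, digits, punctuation, and extra spaces while keeping
    Kannada-script (U+0C80 to U+0CFF) and English letters."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)     # remove URLs
    text = re.sub(r"[^\u0C80-\u0CFFA-Za-z\s]", " ", text)  # keep letters only
    return re.sub(r"\s+", " ", text).strip()               # collapse whitespace

raw = ["Check this http://spam.example now!!! 123", "   ", "Super maga, neenu mass"]
cleaned = [clean_comment(c) for c in raw]
cleaned = [c for c in cleaned if c]  # drop comments that become empty
print(cleaned)  # ['Check this now', 'Super maga neenu mass']
```
          </preformat>
          <p>Dropping comments that become empty after cleaning mirrors the blank-comment removal described above.</p>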
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Label Encoding</title>
          <p>The categorical labels representing the comment types (e.g., Offensive, Non-Offensive, Offensive
Targeted Insult Individual) were converted into numeric values using label encoding. This allowed
the machine learning models to process and learn from the labels effectively.</p>
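          <p>A minimal sketch of this step using scikit-learn's LabelEncoder; the label strings below are hypothetical stand-ins for the dataset's category names:</p>
          <preformat>
```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label strings mirroring the categories described above.
labels = ["Not_offensive", "Offensive_Untargeted",
          "Offensive_Targeted_Insult_Individual", "Not_offensive"]

encoder = LabelEncoder()
y = encoder.fit_transform(labels)  # alphabetical mapping: class name to integer
print(list(y))                     # [0, 2, 1, 0]
# encoder.inverse_transform(y) recovers the original string labels
```
          </preformat>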
        </sec>
        <sec id="sec-3-3-3">
          <title>3.3.3. Feature Extraction</title>
          <p>The cleaned text was transformed into numerical features using TF–IDF (Term Frequency–Inverse
Document Frequency) vectorization. TF–IDF captures the importance of words and word pairs in
the text by considering their frequency across all comments. Both single words (unigrams) and
two-word sequences (bigrams) were included to provide a richer representation of the comments. This
representation allowed the models to learn patterns and relationships between words while reducing
the influence of less informative words.</p>
        </sec>
        <sec id="sec-3-3-4">
          <title>3.3.4. Handling Class Imbalance</title>
          <p>While preprocessing, it was noted that some categories had fewer examples. Although no oversampling
or class weighting was applied in this baseline study, this imbalance was considered when interpreting
model performance and will be addressed in future work.</p>
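          <p>One option future work could apply is class weighting. The counts below are hypothetical, chosen only to illustrate the computation scikit-learn performs for <monospace>class_weight="balanced"</monospace>:</p>
          <preformat>
```python
from collections import Counter

# Hypothetical class counts illustrating the skew noted above.
counts = Counter({"Not_offensive": 4000,
                  "Offensive_Untargeted": 300,
                  "Offensive_Targeted_Insult_Individual": 150})

total = sum(counts.values())  # 4450 comments overall
n_classes = len(counts)       # 3 classes
# The "balanced" heuristic: weight = total / (n_classes * class_count),
# so rare classes contribute more to the training loss.
weights = {label: total / (n_classes * c) for label, c in counts.items()}
for label, w in sorted(weights.items()):
    print(f"{label}: {w:.2f}")
```
          </preformat>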
        </sec>
        <sec id="sec-3-3-5">
          <title>3.3.5. Summary of Prepared Data</title>
          <p>Kannada dataset: 6,218 training comments, 778 development comments, 778 test comments. Tulu
dataset: 2,693 training comments, 578 development comments, 577 test comments</p>
          <p>This preprocessing pipeline ensured that the text data was clean, structured, and ready for machine
learning, providing a solid foundation for the models to classify abusive and non-abusive content
effectively.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <sec id="sec-4-1">
        <title>4.1. Performance Metrics</title>
        <p>Model performance was evaluated using Accuracy, Precision, Recall, and F1-score. Accuracy measures
overall correctness, while Precision indicates the proportion of predicted offensive texts that are truly offensive. Recall reflects the
proportion of actual offensive texts identified, and F1-score balances Precision and Recall, which is
crucial for imbalanced datasets. These metrics help assess real-world effectiveness and optimize abuse
detection systems. Given the linguistic complexity of Kannada and Tulu, they provide insight into
model adaptability across code-mixed text patterns. Table 4 reports classification performance
for Kannada, while Table 5 presents results for Tulu.</p>
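        <p>These metrics can be computed with scikit-learn; the toy binary labels below are hypothetical stand-ins for development-set predictions (1 = offensive):</p>
        <preformat>
```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy binary labels standing in for dev-set predictions (1 = offensive).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

acc = accuracy_score(y_true, y_pred)
# Macro averaging weights both classes equally, which matters under imbalance.
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```
        </preformat>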
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Model Performance Analysis</title>
        <p>The performance of the models was analyzed using the above metrics. The classification reports for the models are
shown below (Figures 1 and 2).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Error Analysis</title>
      <p>To evaluate the limitations of the models, an error analysis was conducted, including both quantitative
and qualitative investigations of misclassified samples. Confusion matrices for Kannada and Tulu
datasets (Figures 1 and 2) were examined to identify recurring error patterns.</p>
      <sec id="sec-5-1">
        <title>5.1. Qualitative analysis</title>
        <p>To better understand model behavior, we analyzed specific examples:
• Correct predictions. Kannada: “Ninna kelasa neenu madoke baralla” (direct insult) → correctly
classified as offensive. Tulu: “Yen dodda maga!” (derogatory phrase) → correctly flagged as
offensive by the ensemble model.
• Incorrect predictions. Kannada: “Super maga, neenu mass” (praise using slang) → misclassified
as offensive due to strong sentiment words. Tulu: “Encha barpuga?” (sarcastic, indirect) →
predicted as non-offensive because sarcasm is context-driven. Mixed: “Ninge ondhu respect illa
bro” → misclassified due to English code-mixing.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Quantitative Analysis</title>
        <p>We evaluated the models using Accuracy, Precision, Recall, and F1-score.</p>
        <p>Kannada and Tulu datasets:
• SVM: Achieved the highest accuracy (78.6%) and a balanced F1-score, making it robust for
surface-level offensive detection.
• Logistic Regression: Performed slightly lower, with strong precision but weaker recall, often
missing subtle abusive cases.
• Naïve Bayes: Achieved competitive results but suffered from high false positives due to its
probabilistic assumptions. It performed well on balanced examples but occasionally misclassified
minority-class abusive comments due to class imbalance.
• The Ensemble model (Voting Classifier with LR, SVM, and NB) outperformed the individual classifiers,
achieving 80.2% accuracy.
• However, minority classes such as indirect offensive content had recall below 60%.
• TF–IDF was effective for direct insults but less effective for sarcasm or indirect abuse.</p>
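        <p>As a minimal sketch of the ensemble configuration, hard (majority) voting is used because LinearSVC exposes no probability estimates; the training snippets and labels are hypothetical toy data, with the first test comment echoing an example from Section 5.1:</p>
        <preformat>
```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hard voting: LinearSVC has no predict_proba, so soft voting is unavailable.
ensemble = VotingClassifier(
    estimators=[("svm", LinearSVC()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("nb", MultinomialNB())],
    voting="hard")
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), ensemble)

# Hypothetical toy training data (Romanized, for illustration only).
texts = ["ninna kelasa neenu madoke baralla", "yen dodda maga",
         "olle video thumba chennagide", "super video bro"]
labels = ["offensive", "offensive", "not_offensive", "not_offensive"]
model.fit(texts, labels)
print(model.predict(["ninna kelasa baralla"]))
```
        </preformat>
        <p>Combining TF–IDF with the voting pipeline keeps feature extraction and classification in a single fit/predict interface.</p>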
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This study provides a comparative evaluation of classical machine learning models for detecting abusive
Kannada–Tulu code-mixed comments. Logistic Regression performed best for Kannada (accuracy 0.58),
while an Ensemble Voting Classifier achieved the highest performance for Tulu (accuracy 0.79). Our
results highlight the limitations of TF–IDF and linear models, which struggle with minority classes,
code-mixing, and subtle abuse such as sarcasm. Nonetheless, these models are lightweight, scalable,
and suitable for deployment in low-resource environments, consistent with observations from prior
work on lightweight baselines for Dravidian languages.</p>
      <p>Practically, this work can serve as a baseline for social media moderation systems in under-resourced
Dravidian languages, assisting platforms in reducing online abuse and toxicity. Similar efforts in Tamil
and Malayalam demonstrate the broader societal relevance of such systems.</p>
      <p>For future work, we aim to explore transformer-based architectures such as IndicBERT, MuRIL, and
XLM-R, which have shown superior contextual understanding in offensive language tasks. We will also
investigate data augmentation strategies, e.g., SMOTE and cost-sensitive learning, building upon prior
work that emphasizes class imbalance handling in Dravidian datasets.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Limitations</title>
      <p>While our approach demonstrates promising results in detecting abusive content in Kannada and Tulu,
two low-resource code-mixed languages, several limitations remain:</p>
      <p>Data imbalance: Both Kannada and Tulu datasets exhibit uneven class distributions, which led
to biased predictions toward majority classes. For instance, Logistic Regression (Kannada) and the
ensemble model (Tulu) often favored dominant categories, reducing sensitivity to minority ofensive
types.</p>
      <p>Code-mixed complexity: The presence of spelling variations, transliteration practices, and
grammar inconsistencies across Kannada–English and Tulu–English text posed major challenges. Since
TF–IDF–based models (SVM, LR, NB) rely heavily on surface-level features, they struggled to generalize
across these variations.</p>
      <p>Generalizability: Models trained on the given datasets may not transfer well to unseen social media
platforms or new domains. Shifts in language usage, emerging slang, or different user communities
could significantly reduce performance.</p>
      <p>Feature representation constraints: Relying solely on TF–IDF restricts the ability to capture
deeper semantic and contextual information. Subtle or implicit abusive language, where context is
critical, was particularly difficult for the models to detect.</p>
      <p>Computational vs. performance trade-off: Although models such as LR, SVM, and NB are
lightweight and computationally efficient, they achieve lower accuracy ceilings compared with
transformer-based approaches. More advanced models such as BERT, RoBERTa, or XLM-RoBERTa could capture richer
contextual cues but demand greater resources, which may not be practical in all settings.</p>
    </sec>
    <sec id="sec-8">
      <title>Project Repository</title>
      <p>The full source code for this project is available on
GitHub: GitHub Repository - archna-v</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>In the course of preparing this manuscript, the author(s) employed the generative AI tool ChatGPT. Its
use was limited to checks of grammar and spelling. Following this, the author(s) conducted
a thorough review and revision of the text and assume full responsibility for the final published content.</p>
      <p>[7] B. R. Chakravarthi, R. Priyadharshini, V. Muralidaran, N. Jose, S. Suryawanshi, E. Sherly, J. P. McCrae, DravidianCodeMix: Sentiment analysis and offensive language identification dataset for Dravidian languages in code-mixed text, Language Resources and Evaluation 56 (2022) 765–806.</p>
      <p>[8] A. Nandi, K. Sarkar, A. Mallick, A. De, A survey of hate speech detection in Indian languages, Social Network Analysis and Mining 14 (2024) 70. doi:10.1007/s13278-024-01223-y.</p>
      <p>[9] D. Sharma, V. Gupta, V. K. Singh, Stop the hate, spread the hope: An ensemble model for hope speech detection in English and Dravidian languages, ACM Transactions on Asian and Low-Resource Language Information Processing (2025). Accepted.</p>
      <p>[10] A. M. D, D. Vikram, B. R. Chakravarthi, P. R. Hegde, Overcoming low-resource barriers in Tulu: Neural models and corpus creation for offensive language identification, 2025. URL: https://arxiv.org/abs/2508.11166. arXiv:2508.11166.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sripriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bharathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Navaneethakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Anusha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vikram</surname>
          </string-name>
          ,
          <article-title>Overview of the shared task on offensive language identification in Dravidian code-mixed languages, in: Forum for Information Retrieval Evaluation (FIRE-2025)</article-title>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Sakib</surname>
          </string-name>
          , T. T. Preeti,
          <string-name>
            <given-names>J.</given-names>
            <surname>Allohibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Alharbi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uddin</surname>
          </string-name>
          ,
          <article-title>OLF-ML: An offensive language framework for detection, categorization, and offense target identification using text processing and machine learning algorithms</article-title>
          ,
          <source>Mathematics</source>
          <volume>12</volume>
          (
          <year>2024</year>
          )
          <fpage>2123</fpage>
          . doi:10.3390/math12132123.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Manukonda</surname>
          </string-name>
          , bytellm@LT-EDI-2024:
          <article-title>Homophobia/transphobia detection in social media comments - custom subword tokenization with Subword2Vec and BiLSTM</article-title>
          ,
          <source>in: Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Uddin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Arefin</surname>
          </string-name>
          , Cuet_ignite@dravidianlangtech
          <year>2025</year>
          :
          <article-title>Detection of abusive comments in Tamil text using transformer models</article-title>
          ,
          <source>in: Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Jagadeeshan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Palanikumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <article-title>Offensive language identification in Dravidian languages using MPNet and CNN</article-title>
          ,
          <source>International Journal of Information Management Data Insights</source>
          <volume>3</volume>
          (
          <year>2023</year>
          )
          <fpage>100151</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S2667096822000945. doi:10.1016/j.jjimei.2022.100151.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          , N. Jose, T. Mandl,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherly</surname>
          </string-name>
          , et al.,
          <article-title>Findings of the shared task on offensive language identification in Tamil, Malayalam, and Kannada</article-title>
          ,
          <source>in: Proceedings of the first workshop on speech and language technologies for Dravidian languages</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>145</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>