=Paper=
{{Paper
|id=Vol-3199/paper3
|storemode=property
|title=LAHM : Large Annotated Dataset for Multilingual & Multi-Domain Hate Speech Identification
|pdfUrl=https://ceur-ws.org/Vol-3199/paper3.pdf
|volume=Vol-3199
|authors=Ankit Yadav,Shubham Chandel,Sushant Chatufale,Anil Bandhakavi
|dblpUrl=https://dblp.org/rec/conf/aaai/YadavCCB22
}}
==LAHM : Large Annotated Dataset for Multilingual & Multi-Domain Hate Speech Identification==
LAHM : Large Annotated Dataset for Multi-Domain
and Multilingual Hate Speech Identification
Ankit Yadav, Shubham Chandel, Sushant Chatufale and Anil Bandhakavi
Logically.ai, Brookfoot Mills, Brookfoot Industrial Estate, Brighouse, HD6 2RW, United Kingdom
Abstract
Current research on hate speech analysis is typically oriented towards monolingual, single-task
classification. In this paper, we present a new multilingual hate speech analysis dataset covering
English, Hindi, Arabic, French, German and Spanish across five hate speech domains: Abuse,
Racism, Sexism, Religious Hate and Extremism. To the best of our knowledge, this paper is the first
to address the problem of identifying various types of hate speech in these five broad domains in these
six languages. We describe how we created the dataset, how we created high-level and low-level
annotations for the different domains, and how we use it to test current state-of-the-art multilingual
and multitask learning approaches. We evaluate our dataset in various monolingual, cross-lingual and
machine translation classification settings and compare it against open source English datasets that we
aggregated and merged for this task. We then discuss how this approach can be used to create large-scale
hate speech datasets and how to leverage our annotations to improve hate speech detection and
classification in general.
Keywords
hate speech, multilingual, multi-domain, cross-lingual, racism, religious hate, sexism, abuse, extremism,
few shot learning, zero shot learning
1. Introduction
Abusive language is an important and relevant issue on social media platforms such as Twitter.
Social media is often exploited to propagate toxic content such as hate speech or other forms of
abusive language. The amount of user-generated content produced every minute is very large,
and manually monitoring abusive behavior on Twitter is neither feasible nor practical. Twitter
has made efforts to eliminate abusive content from its platform by providing clear policies
on hateful conduct, supporting user reporting and using moderators to filter content. Still, these
manual efforts do not scale and are not a long-term solution.
Several studies from the Natural Language Processing (NLP) field have been done to tackle the
problem of hate speech detection in social media. Most studies proposed a supervised approach
to detect abusive content automatically using various models ranging from traditional machine
learning approaches to deep learning based approaches. However, the majority of work focused
only on a single language, i.e., English, and a single abusive domain phenomenon, e.g., hate
De-Factify: Workshop on Multimodal Fact Checking and Hate Speech Detection, co-located with AAAI 2022. 2022
Vancouver, Canada
ankit.yadav@logically.co.uk (A. Yadav); Shubham1_c@logically.co.uk (S. Chandel); sushant.c@logically.co.uk
(S. Chatufale); anil@logically.co.uk (A. Bandhakavi)
https://www.logically.ai/team/leadership/anil-bandhakavi (A. Bandhakavi)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073
speech, sexism, racism, religious hate and so on, rather than multiple languages and multiple
domains. Twitter supports content in 34 languages, and a user can use any of them to express
views. Tackling hateful content in real time across multiple languages is therefore
a challenge. We need robust models for hateful content detection across multiple languages and
multiple domains.
In this paper we tackle two prominent challenges in hate speech detection:
1. Build a multilingual dataset for hate speech detection across 6 languages: English,
Hindi, French, Arabic, German and Spanish.
2. Build a multi-domain dataset that covers these hate speech domains: Abuse, Racism,
Sexism, Religious Hate and Extremism.
We define the different domains as follows:
1. RACISM: Discrimination based on race, ethnicity, caste, nationality, culture, skin colour,
hair texture, physical aspects.
2. SEXISM: Discrimination based on gender/sexual orientation.
3. RELIGIOUS HATE: Religious discrimination, i.e., treating a person or group differently because
of the particular faith or belief they hold about a religion.
4. ABUSE: Speech that causes or is likely to cause distress, disrespect or mental pain, especially
through vulgar and profane comments.
5. EXTREMISM: Speech that causes or is likely to cause harm to individuals, communities
or wider society, and where political or civil issues can lead to extremist behaviour
through violence.
2. Related Work
There have been several studies on abusive language detection [1] [2], offensive language, hate
speech identification [3], toxicity [4], hatefulness [5], aggression [6], attack [7], racism, sexism
[8], obscenity, threats, and insults.
In addition, several shared tasks have focused on abusive language and
hate speech detection, such as HASOC-2019 [9], the TRAC shared task on aggression identification
[6], HatEval [10] and GermEval-2018 [11], which focused on offensive language identification in
German tweets.
Waseem and Hovy [8] proposed the following list to identify hate speech. Their criteria are partially
derived by negating the privileges observed in McIntosh (2003), where they occur as ways to
highlight importance, ensure an audience, and ensure safety for white people, and partially
derived from applying common sense. A tweet is categorized as offensive if it:
1. uses a sexist or racial slur.
2. attacks a minority.
3. seeks to silence a minority.
4. criticizes a minority (without a well founded argument).
5. promotes, but does not directly use, hate speech or violent crime.
6. criticizes a minority and uses a straw man argument.
7. blatantly misrepresents truth or seeks to distort views on a minority with unfounded
claims.
8. shows support of problematic hash tags - “#BanIslam”, “#whoriental”, “#whitegenocide”
9. negatively stereotypes a minority.
10. defends xenophobia or sexism.
11. contains a screen name that is offensive, as per the previous criteria, the tweet is ambiguous
(at best), and the tweet is on a topic that satisfies any of the above criteria.
A wide variety of machine learning models have been used for the multi-domain hate
speech detection task. Some studies use traditional machine learning approaches, chosen for their
better explainability, such as logistic regression [12], support vector machines [13] [14] and
linear support vector classifiers (LSVC) [15] [16] [17]. Others use deep learning
based models, including convolutional neural networks [18] [19], LSTM [20] [18] [17] [16] [21] and
bidirectional LSTM [22]. Most recent works focus on transfer learning and novel architectures
involving Transformer-based models such as Bidirectional Encoder Representations from
Transformers (BERT) [23] [24] [25] [26] [27] and its variants like RoBERTa [25] in the
cross-domain abusive language detection task.
For cross-lingual abusive language detection, most studies utilized Transformer-based models.
Some traditional models such as logistic regression [28] [10] [29], linear support vector
machines [16] [17], SVM [30], LSTM [31] [29] and Bi-LSTMs [31] have also been used. Recent
work has focused on several Transformer-based architectures such as multilingual BERT [32] [28] [25]
[17] [33] [34] [29], RoBERTa [35], XLM [31] [34] and XLM-RoBERTa [35] [25] [36]. Transformer-based
models with multilingual language representations can easily deal with language shift in the
zero-shot cross-lingual task.
3. LAHM Dataset
In this section, we describe the characteristics that we want to include in our dataset, our
approach to collect different types of hate speech while covering all major hate speech domains
and how to annotate data at large scale. We also give detailed statistics and analysis for the
collected data.
3.1. Essentials of LAHM dataset
Since no dataset currently available covers these five domains of hate speech
in the six languages, our aim is to create a reliable multilingual and multifaceted hate speech
dataset.
Multilingual data: Our dataset is created as a multilingual resource to facilitate cross-lingual
research. It contains hate speech in English, Hindi, French, Arabic, German and Spanish.
Figure 1: Overall Annotation Pipeline Architecture
Multi-Domain data: Our dataset consists of fine-grained labels for each hate sample per
language. These labels cover the majority of hate domains: racism, sexism, abuse, religious
hate and extremism.
3.2. Dataset Collection
3.2.1. Language and Domain Specific Keywords Collection
Considering the cultural differences in the main regions where English, Hindi, French, Arabic,
Spanish and German are spoken, we start by looking for hateful keywords that are native to
these languages. We use the Hatebase vocabulary dataset, which is a valuable lexicon for creating
hate speech datasets from public forums, as well as Hatebase’s sightings dataset, which is useful for
trend analysis of keywords.
We targeted 6 languages. We also tried some Indic languages, mainly Marathi and
Bangla, but Hatebase has negligible keyword coverage for these languages. For each keyword,
we additionally extracted the following:
1. Targeted groups for the keywords
2. Offensive levels of keywords - Extremely, Highly and Mildly offensive.
3. Recent sighting counts
Target groups help us categorise keywords into the low-level classes of ethnicity & nationality,
religious hate, gender & sexual orientation and general abuse. Details of keywords per domain
and language are provided in Table 1 in dataset statistics. In total, 1022 keywords were collected.
To extract data for extremism, we used a set of keywords related to extremism and terrorism
(including terrorist organisation names) to retrieve news articles from BBC Monitoring (footnote 1).
1. We further extracted comments from those articles related to extremism and terrorism.
We also extracted tweets and comments which were extremist in nature from the
counterextremism database (footnote 2).
2. The extracted comments were made by members and suspects of terrorist and extremist
organisations on Facebook, Twitter and YouTube.
Footnote 1: https://monitoring.bbc.co.uk/
3.2.2. Large Scale Multilingual Dataset of Tweets
We started our dataset collection by using the keywords built per language. We utilize the twint
library (footnote 3) to collect 1000-2000 tweets per keyword, as the official Twitter API only allows a
limited number of requests. We searched for data from the last year. We also added
additional keywords from MLMA [1] for Arabic and French.
The collected raw data contains tweets in multiple languages, and therefore language detection
becomes a part of our process. For each keyword in each language we keep tweets only in
that language and drop the rest. This helps us in training monolingual hate speech detection
models without having to worry about code switching between languages.
We substituted all usernames with @USER and URLs with @URL and cleaned any unnecessary
symbols from the tweet. We also discarded short tweets with fewer than 4 words. In total we
collected 497660 tweets. Details of tweets per language are given in Section 3.3, Dataset Statistics.
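The masking and length filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: the regular expressions, the function name `clean_tweet` and the choice to count whitespace-separated tokens after masking are assumptions, and the removal of "unnecessary symbols" is left out because the paper does not specify which symbols were dropped.

```python
import re

# Assumed patterns: a URL is anything starting with http(s):// or www.,
# a username is "@" followed by word characters.
TOKEN_RE = re.compile(r"(?P<url>https?://\S+|www\.\S+)|(?P<user>@\w+)")

def _mask(match):
    # Replace URLs with @URL and usernames with @USER in a single pass,
    # so that the inserted @URL placeholder is never re-masked as a username.
    return "@URL" if match.group("url") else "@USER"

def clean_tweet(text, min_words=4):
    """Mask usernames and URLs; return None for tweets shorter than min_words."""
    masked = TOKEN_RE.sub(_mask, text)
    words = masked.split()  # also collapses repeated whitespace
    if len(words) < min_words:
        return None
    return " ".join(words)

print(clean_tweet("@bob check https://t.co/abc this out now"))
print(clean_tweet("hi @bob"))
```

Doing both substitutions in one pass avoids an ordering problem: masking URLs first and usernames second would turn every `@URL` placeholder into `@USER`.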
For the extremism class we carefully handcrafted a total of 88 keywords and collected the data
from BBC Monitoring. Additional extremism data was collected from other websites.
3.3. Dataset Statistics
Table 1 gives the distribution of keywords across the different languages and domains.
We merged classes from Hatebase to arrive at these domains. Sexism contains keywords
that belong to gender and sexual orientation. Racism includes the classes of ethnicity and nationality.
Religious Hate includes words belonging to the Religion category on Hatebase (footnote 4). Extremism
keywords were handcrafted as described in Section 3.2.
English was the dominant language in terms of the total keywords (500) extracted. Except for
Hindi, racism is the domain with the highest number of hate keywords in every language.
3.4. HSMerge Dataset
3.4.1. Dataset Preparation
In order to build the dataset for different tasks with gold labels, we utilize 10 publicly available
datasets for different types (tasks) of hate speech detection. We sampled annotated examples
from diverse English datasets: GAO hatefulness [5], TRAC aggression [6], offensive [37], racism
and sexism [8], MLMA [1], attack [7], CONAN [38], MMHS150k [39].
We map labels from these datasets into different domains: abusive, non-abusive, sexism,
racism and religious hate. For OLID [37], OFFENSIVE maps to ABUSE. GAO and WUL had binary
Footnote 2: https://www.counterextremism.com/daily-dose-archive
Footnote 3: https://github.com/twintproject/twint/
Footnote 4: https://hatebase.org/
Table 1
Keywords language & domain distribution
Language  Sexism  Racism  Abuse  Religious Hate  Extremism
English   69      418     30     33              88
Hindi     75      16      9      8               -
Arabic    27      35      17     17              -
French    13      85      13     9               -
German    21      67      13     11              -
Spanish   43      77      21     2               -
Total     248     698     103    80              88
Table 2
Tweets distribution
Languages  Raw     Processed
English    184339  105120
Hindi      59321   32734
Arabic     27374   5394
French     75126   20809
German     30868   8631
Spanish    120632  55148
Total      497660  227836
labels, while the original TRAC uses three labels: non-aggressive, covertly-aggressive, and
openly-aggressive. We relabel the first as non-abusive, and the other two as abusive. For MLMA [1]
we create labels on the basis of target groups: target groups GENDER and SEXUAL ORIENTATION map to
SEXISM, target group RELIGION becomes RELIGIOUS HATE, and target group ORIGIN becomes RACISM. For
HASOC [9], OFFENSIVE maps to ABUSE. For CONAN [40], target groups MUSLIMS and JEWS become
RELIGIOUS HATE labels, target groups MIGRANTS and POC map to RACISM, target groups WOMEN and
LGBT+ map to SEXISM, and DISABLED maps to the general ABUSE group. For MMHS150K [39] we
obtained the labels RACISM, SEXISM, RELIGIOUS HATE and OTHERHATE.
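The label harmonisation described in this paragraph amounts to a lookup from (source dataset, source label) to a target domain. The sketch below encodes only the mappings stated in the text; the dictionary layout, the upper-casing of keys, the name `map_label` and the choice to return `None` for unlisted labels are illustrative assumptions, and the actual label strings in each source dataset may differ.

```python
# Mappings taken from the text above; the key spellings are assumed.
LABEL_MAP = {
    ("OLID", "OFFENSIVE"): "ABUSE",
    ("HASOC", "OFFENSIVE"): "ABUSE",
    ("TRAC", "NON-AGGRESSIVE"): "NON-ABUSIVE",
    ("TRAC", "COVERTLY-AGGRESSIVE"): "ABUSE",
    ("TRAC", "OPENLY-AGGRESSIVE"): "ABUSE",
    ("MLMA", "GENDER"): "SEXISM",
    ("MLMA", "SEXUAL ORIENTATION"): "SEXISM",
    ("MLMA", "RELIGION"): "RELIGIOUS HATE",
    ("MLMA", "ORIGIN"): "RACISM",
    ("CONAN", "MUSLIMS"): "RELIGIOUS HATE",
    ("CONAN", "JEWS"): "RELIGIOUS HATE",
    ("CONAN", "MIGRANTS"): "RACISM",
    ("CONAN", "POC"): "RACISM",
    ("CONAN", "WOMEN"): "SEXISM",
    ("CONAN", "LGBT+"): "SEXISM",
    ("CONAN", "DISABLED"): "ABUSE",
}

def map_label(dataset, label):
    """Return the harmonised domain, or None when the text gives no mapping."""
    return LABEL_MAP.get((dataset.upper(), label.upper()))

print(map_label("MLMA", "origin"))
print(map_label("TRAC", "covertly-aggressive"))
```

Keeping the mapping in one table makes it easy to audit against the prose and to extend when another source dataset is merged.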
For some of these datasets we had to hydrate tweets from Twitter using the official Twitter API, as
only the tweet ids had been provided. For Waseem and Founta [38], the majority of the tweets were
no longer available on Twitter, which led to fewer tweets than in the official dataset. We
standardized these datasets into a single HSMerge dataset. Details of classes and total samples are
available in Table 3.
Table 3
HSMerge open source gold dataset
Datasets  Domain                                             Hate                           Non-Hate  Total
OLID      Offensive / NotHate                                4400                           8840      13240
COnan     Racism / Sexism / Religious / Abuse / OtherHate    938 / 1025 / 1921 / 175 / 179  -         4238
TRAC      Hateful / NotHate                                  5993                           4348      10341
MLMA      Racism / Sexism / Religious / OtherHate            2448 / 1152 / 68 / 1979        -         5647
MMHS150K  Racism / Sexism / Religious / OtherHate / NotHate  11925 / 7365 / 163 / 5811      112845    138209
HASOC     Hateful / NotHate                                  2261                           3591      5852
WASEEM    Racist / Sexist / NotHate                          2753 / 15                      7640      10408
WUL       Hate / NotHate                                     8817                           62937     71754
GAO       Hate / NotHate                                     244                            675       909
4. Experiments
4.1. Overall Methodology and Models
In this section, we describe the components of our pipeline, shown in Figure 1, and
the models used for the semi-supervised annotation process in different settings. Most
hate speech detection tasks depend on manual annotation, which limits the number of samples that can
be labeled. We used a hierarchical approach to validate and refine our initial keyword-based
dataset from Twitter:
1. Use monolingual hate speech detection models from Hate-ALERT as a zero-shot pipeline
to get high-level labels of Hate or NoHate in monolingual settings.
2. Fine-tune a multilingual mBERT model to get high-level labels.
3. Translate the raw data of the five languages other than English to English.
4. Use the multilingual binary classifier (developed in step 2) on the translated data to classify tweets
at a high level.
5. Utilize the Google Perspective API to predict the toxicity of tweets for high-level labels.
6. Use two-way voting to predict the final binary labels.
7. Fine-tune the "distilbert-base-uncased" model on only the English dataset, and use it to
annotate the English translations of the data collected from Twitter in 5 languages.
8. To validate our domain-specific labels, train various models on the HSMerge data shown
in Table 3 and predict on the hateful samples obtained from step 6.
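The voting in step 6 can be sketched as below. The paper does not spell out the exact rule, so this is one plausible reading: for each language a set of voter models is selected, and a tweet is labelled Hate only when at least `min_votes` of them predict Hate. The model names, the `vote` function and the default threshold of 2 are assumptions for illustration.

```python
def vote(predictions, voters, min_votes=2):
    """Combine per-model binary labels into one final label.

    predictions: dict mapping model name -> "Hate" or "NoHate"
    voters: names of the models allowed to vote for this language
    """
    hate_votes = sum(1 for name in voters if predictions.get(name) == "Hate")
    return "Hate" if hate_votes >= min_votes else "NoHate"

# Hypothetical predictions for a single tweet.
preds = {"monolingual": "Hate", "multilingual": "Hate", "perspective": "NoHate"}
print(vote(preds, ["monolingual", "multilingual"]))
print(vote(preds, ["multilingual", "perspective"]))
```

With two voters and a threshold of two, the rule reduces to requiring agreement, which favours precision of the Hate label over recall.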
4.2. Experimental Setting
To evaluate the performance of the models, we used weighted average F1 for benchmarking on
the validation set. All the experiments were done on an NVIDIA A100 GPU with up to 20 GiB RAM.
4.3. Hatefulness (Hate/No-Hate)
We utilised hate speech models to perform initial validation on the raw LAHM dataset; details
can be found in Section 5.
4.3.1. Monolingual Experiments
We utilized BERT based language specific hate speech models from HuggingFace. The models
used were trained, validated and tested on the same language.
We utilized the following language models for our experiments:
1. Hate-speech-CNERG/dehatebert-mono-english (footnote 5)
2. l3cube-pune/hate-multi-roberta-hasoc-hindi (footnote 6)
3. Hate-speech-CNERG/dehatebert-mono-french (footnote 7)
4. Hate-speech-CNERG/dehatebert-mono-german (footnote 8)
5. Hate-speech-CNERG/dehatebert-mono-spanish (footnote 9)
The binary hate label distributions from the monolingual experiments are presented in Table 8.
4.3.2. Machine Translation Experiments
We used open source machine translation models to translate
English data into the other languages and vice versa. To select the best machine translation model
for each of our languages, we evaluated a number of models on a small manually annotated
dataset. The translations were carried out for each of these languages: Hindi, Arabic, French,
German and Spanish. Two types of translations were carried out using the translation models:
1. Translation of the English dataset collected from various open source hate speech datasets.
All the English language samples in this dataset were translated to the other 5 languages.
2. Translation of multilingual data collected from Twitter. This data was collected for the
above 5 languages, and the data for each language was translated into
English.
The translation models were chosen and evaluated based on whether they
were open source, free and easy to use, and on translation quality. The translation models used were:
1. Google Sheets translation
2. m2m-100-1.2B (footnote 10)
3. IndicTrans (footnote 11)
The evaluation metrics used were BLEU, ROUGE-L and semantic similarity. For semantic similarity,
the all-mpnet-base-v2 model from the sentence-transformers library was used to calculate the cosine
score between input and translated sentence embeddings. Figures 2, 3 and 4 show comparisons
of the BLEU, ROUGE-L and semantic similarity scores for the 3 models on different languages. After
comparing the performances and taking other considerations into account, the IndicTrans model
was selected for all translations of Indic languages (Hindi), and m2m-100-1.2B was selected for
the other languages.
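The semantic similarity metric mentioned above is a cosine score between two sentence embeddings. The sketch below shows only the cosine computation on plain Python lists; in the paper's setup the vectors would come from the all-mpnet-base-v2 model, whereas the toy vectors here are made up for illustration, and returning 0.0 for a zero vector is an assumed convention.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for degenerate embeddings
    return dot / (norm_u * norm_v)

# Toy stand-ins for the embeddings of a source sentence and its translation.
source_embedding = [0.2, 0.7, 0.1]
translated_embedding = [0.25, 0.65, 0.05]
print(round(cosine_similarity(source_embedding, translated_embedding), 3))
```

A score near 1 indicates that the translation stays close to the source in embedding space, which complements the n-gram overlap measured by BLEU and ROUGE-L.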
Footnote 5: https://huggingface.co/Hate-speech-CNERG/dehatebert-mono-english
Footnote 6: l3cube-pune/hate-roberta-hasoc-hindi
Footnote 7: https://huggingface.co/Hate-speech-CNERG/dehatebert-mono-french
Footnote 8: https://huggingface.co/Hate-speech-CNERG/dehatebert-mono-german
Footnote 9: https://huggingface.co/Hate-speech-CNERG/dehatebert-mono-spanish
Footnote 10: https://github.com/UKPLab/EasyNMT
Footnote 11: https://github.com/AI4Bharat/indicTrans
Figure 2: bleu scores for model translations
Figure 3: rougeL scores for model translations
Figure 4: Semantic similarity scores for model translations
4.3.3. Binary Multilingual Experiments
We utilized mBERT, a model pre-trained on the 104 languages with the largest
Wikipedia corpora using a masked language modeling (MLM) objective, and fine-tuned it for a binary
classification task to perform few-shot learning experiments. The aim was to make the classifier
learn from samples in a few languages and test on other languages. The training dataset
was predominantly English, curated from multiple open source resources. In addition,
we introduced samples for a few other languages using a neural machine translation (NMT)
based open source model, m2m_100_1.2B (footnote 12). NMT is an approach to machine translation
that uses artificial neural networks to predict the most likely sequence of words.
The model was trained, validated and tested on the custom multilingual dataset curated with
the class distribution shown in Table 4 per chosen language. We utilized this model to make
predictions on the curated dataset.
Table 4
Multilingual binary classifier training data
Languages  Hate   NoHate  Total
English    20620  18171   38791
Arabic     20539  18138   38680
Hindi      20541  18138   38679
French     20542  18138   38680
Total      82242  72584   154826
4.3.4. Perspective API Experiments
Perspective API is a free-to-use API that uses machine learning to identify toxicity in
comments. It has an API rate limit of 1 request per second. Due to this limitation we utilized
it on all languages except English. We used Google Cloud to generate the Perspective API
key for sending requests, and specifically the CommentAnalyzer API to get the toxicity score for
each data point. The CommentAnalyzer API supports all 5 languages we considered.
We used the cleaned Twitter data we had collected for the 5 languages to get the toxicity score
and classify each tweet as hate or not-hate. The numbers are reported in Table 8.
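A rate-limited scoring loop of the kind described above might look like the following sketch. The request body and the response path follow the public CommentAnalyzer request format; everything else is an assumption: the paper does not state the toxicity threshold used (0.5 here is a placeholder), and `send` is a hypothetical caller-supplied function that POSTs the body to `ANALYZE_URL` with the API key and returns the parsed JSON response.

```python
import time

# Endpoint of the Perspective CommentAnalyzer API.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(text, lang):
    """Request body asking for the TOXICITY attribute of one comment."""
    return {
        "comment": {"text": text},
        "languages": [lang],
        "requestedAttributes": {"TOXICITY": {}},
    }

def toxicity_score(response):
    """Extract the summary toxicity score (0.0 to 1.0) from a response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

def label_tweets(tweets, lang, send, threshold=0.5, delay=1.0):
    """Score tweets one by one, pausing `delay` seconds between requests."""
    labels = []
    for i, text in enumerate(tweets):
        if i > 0:
            time.sleep(delay)  # respect the 1 request per second limit
        score = toxicity_score(send(build_request(text, lang)))
        labels.append("Hate" if score >= threshold else "NoHate")
    return labels
```

Passing the transport in as `send` keeps the sketch free of network code and makes the thresholding logic easy to check with a stub.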
4.4. Multi-Domain Experiments
This section describes the multi-domain experiments to validate labels on the LAHM dataset.
4.4.1. Machine Translation Based Multi-Domain Experiments
The HSMerge data presented in Table 3 was used for the multi-domain classification of hate speech. This
dataset was pre-processed by removing URLs, mentions/usernames, null values and duplicates.
The aim was to use transfer learning to fine-tune a model on only the English dataset, and
use it to annotate the English translations of the data collected from Twitter in the other
5 languages.
Footnote 12: https://github.com/pytorch/fairseq/tree/main/examples/m2m_100
The model fine-tuned on this dataset was "distilbert-base-uncased" (footnote 13). DistilBERT is
a distilled version of the BERT base model. It has 40 percent fewer parameters than
bert-base-uncased, is 60 percent faster and retains about 97 percent of BERT’s performance (footnote 14). A linear
layer was added on top of the pooled output of the model for multi-class classification.
During training, the maximum sequence length was limited to 256, the batch size was set to 32,
the learning rate to 5e-5, and the model was trained for 4 epochs. The dataset was split into train,
validation and test sets using stratified random sampling. Dataset details:
1. 20,174 training samples
2. 5,044 validation samples
3. 6,305 test samples
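Stratified random sampling keeps the class proportions roughly equal in every split. The sketch below is a generic pure-Python version, not the authors' code; the function name, the seed and the 20 percent hold-out fraction are assumptions. The reported sizes are consistent with holding out about 20 percent for test and then about 20 percent of the remainder for validation, but the paper does not state the fractions.

```python
import random
from collections import defaultdict

def stratified_split(labels, holdout_frac=0.2, seed=0):
    """Split sample indices so each class keeps its share in both parts."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    rng = random.Random(seed)
    keep, holdout = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        n_holdout = round(len(idxs) * holdout_frac)
        holdout.extend(idxs[:n_holdout])
        keep.extend(idxs[n_holdout:])
    return keep, holdout

# Toy labels: 80 "racism" and 20 "extremism" samples.
labels = ["racism"] * 80 + ["extremism"] * 20
train_val, test = stratified_split(labels)
print(len(train_val), len(test))
```

Applying the function a second time to the `train_val` indices would carve out the validation set in the same way.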
The model evaluation metrics on the test set are given in Table 5.
Table 5
Translated multiclass metrics
Domain          precision  recall  f1-score  support
racism          0.94       0.95    0.94      2882
sexism          0.93       0.93    0.93      2295
religious hate  0.92       0.93    0.92      409
abuse           0.94       0.91    0.93      589
extremism       0.83       0.82    0.82      130
accuracy                           0.93      6305
macro avg       0.91       0.91    0.91      6305
weighted avg    0.93       0.93    0.93      6305
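As a worked check of how the two averages in Table 5 relate to the per-class rows, the macro average is the plain mean of the class F1 scores, while the weighted average (the benchmark metric named in Section 4.2) weights each class by its support. The snippet below recomputes both from the rounded values in the table, so small rounding differences from the original unrounded scores are possible.

```python
# (f1-score, support) per domain, copied from Table 5.
TABLE_5 = {
    "racism": (0.94, 2882),
    "sexism": (0.93, 2295),
    "religious hate": (0.92, 409),
    "abuse": (0.93, 589),
    "extremism": (0.82, 130),
}

total_support = sum(support for _, support in TABLE_5.values())
macro_f1 = sum(f1 for f1, _ in TABLE_5.values()) / len(TABLE_5)
weighted_f1 = sum(f1 * support for f1, support in TABLE_5.values()) / total_support

print(total_support, round(macro_f1, 2), round(weighted_f1, 2))
```

The weighted average sits above the macro average because the weakest class, extremism, has the smallest support and therefore contributes little to the weighted score.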
4.4.2. Cross lingual based Multi-Domain Experiments
To validate our fine-grained labels with keywords from Hatebase, we experimented with cross-lingual
models trained on the HSMerge dataset shown in Table 3. In this experimental setting
we fine-tuned the mBERT model for the multi-class classification task on the
HSMerge data in Table 3 and predicted on the other languages. Empirical results can be
found in Table 7. In this zero-shot setting, no samples in other languages were given to the
model after fine-tuning on the English dataset, and machine translation was not involved
in either pre-training or fine-tuning.
We could not maintain the class distribution, and the extremism class was left with
comparatively fewer data points. To counter this issue we introduced custom class weights to
give equal attention to the minority class, which in our case was extremism. This was done
using a weighted random sampler (WeightedSampler) for class imbalance, so that all classes
have equal probability of being sampled. The empirical scores achieved by the cross-lingual model
can be found in Table 7.
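The idea behind the weighted sampler can be illustrated without any deep learning framework: each sample receives a weight equal to the inverse of its class frequency, so every class carries the same total sampling mass. The sketch below is an assumption about how such weights are typically built, not the authors' code. In PyTorch, weights like these are commonly passed to `torch.utils.data.WeightedRandomSampler`, which may be what the text's WeightedSampler refers to.

```python
import random
from collections import Counter

def sample_weights(labels):
    """Weight each sample by the inverse frequency of its class."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]

def draw_indices(labels, k, seed=0):
    """Draw k sample indices with replacement using the class-balanced weights."""
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=sample_weights(labels), k=k)

# Toy imbalance: 90 majority samples and 10 minority samples.
labels = ["racism"] * 90 + ["extremism"] * 10
drawn = draw_indices(labels, k=2000)
minority_share = sum(1 for i in drawn if labels[i] == "extremism") / len(drawn)
print(round(minority_share, 2))
```

Although extremism makes up 10 percent of the toy data, it is drawn about half the time, which is the "equal probability" effect described above.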
Footnote 13: https://huggingface.co/distilbert-base-uncased
Footnote 14: https://arxiv.org/abs/1910.01108
The open source mBERT model we fine-tuned for our use case has previously been tested in
zero-shot experiments (footnote 15) and achieved the empirical scores shown in Table
6. Dataset details:
1. 19298 training samples
2. 6432 validation samples
3. 6434 test samples
Table 6
Bert zero shot gold F1 scores
Model           English  Spanish  German  Arabic
BERT-Zero Shot  81.4     74.3     70.5    62.1
Table 7
HSMerge cross-lingual metrics
Domains         precision  recall  f1-score  support
racism          0.93       0.95    0.94      2161
sexism          0.93       0.91    0.92      1722
religious hate  0.93       0.93    0.93      307
abuse           0.93       0.92    0.92      442
extremism       0.85       0.80    0.83      97
accuracy                           0.93      4729
macro avg       0.91       0.90    0.91      4729
weighted avg    0.93       0.93    0.93      4729
5. Evaluation
We performed an initial level of validation (hatefulness) experiments on the raw LAHM dataset
using open source monolingual models, Perspective API and the multilingual binary classifier
we trained. We compared the monolingual and multilingual results and noticed that our
multilingual few-shot learning based classifier outperformed the open source monolingual
BERT-based models from HuggingFace, except on Arabic and French, where the latter
did a better job; details can be found in Table 8. Based on this analysis, we took at least 2 votes
from the models that performed better; details are shown in Table 9.
For Hindi, Perspective API performed better, predicting 31 percent as hate compared to 12
percent for the binary classifier. For Arabic, the binary classifier performed significantly better.
For domain validation we utilised the zero-shot cross-lingual multi-domain and machine
translation based multi-domain models trained with the HSMerge data in Table 3 and performed
Footnote 15: https://github.com/google-research/bert/blob/master/multilingual.md#results
Table 8
All models prediction on high level label Hate and NoHate
Language  Monolingual Hate  Monolingual NoHate  Multilingual Hate  Multilingual NoHate  Perspective Hate  Perspective NoHate
English   7247              101345              11380              99598                -                 -
Hindi     210               260                 3970               7247                 1674              3720
Arabic    6469              26265               3914               28820                1376              31358
French    9148              24185               8893               24440                7540              13269
German    468               23910               3905               20473                4467              4164
Spanish   2088              10087               4734               7441                 24952             30196
Total     25630             186052              36796              188019               40009             82707
Table 9
Voting across multiple models for hate detection
Language Mono-Lingual Multi-lingual Perspective API
English ✓ ✓
Hindi ✓ ✓
Arabic ✓ ✓
French ✓ ✓
German ✓ ✓
Spanish ✓ ✓
Table 10
Cross-lingual multi-domain predictions distribution
Language  Abuse  Sexism  Racism  Religious  Extremism
English   4228   2934    1566    751        154
Arabic    1895   994     736     164        5631
Hindi     1511   440     282     22         1715
French    813    779     2393    750        4413
German    180    123     30      4          131
Spanish   818    483     139     11         637
Total     9445   5753    5146    1702       12681
Table 11
Machine Translation multi-domain predictions distribution
Language  Abuse  Sexism  Racism  Religious  Extremism
English   8691   6693    2495    261        524
Arabic    430    1302    209     259        1138
Hindi     132    204     67      137        105
French    745    1004    545     273        538
German    232    608     186     58         98
Spanish   2082   2024    1519    347        1310
Total     12312  11835   5021    1335       3713
predictions on the different languages. The label distribution from the cross-lingual models is shown
in Table 10. The label distribution of predictions for the different languages on the LAHM dataset from
the multi-class translated model is in Table 11.
For Hindi, 31.6 percent of samples were Abuse, while the lowest share was 10.3 percent for Racism.
For Arabic, Sexism and Extremism contributed 73 percent of all hate samples. For French, 32
percent were Abuse and the lowest share, 8.3 percent, was Religious Hate. For German, 51 percent were
Sexism, while the lowest were Religious Hate and Extremism, together amounting to 13 percent.
For Spanish, Abuse and Sexism had 27 percent each, while Religious Hate had 5 percent of
samples.
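Per-language shares of this kind can be recomputed directly from the prediction counts. The sketch below does so for three rows of Table 11, on the assumption that the percentages quoted above are row-wise shares of that table; the helper name `shares` is illustrative.

```python
# Prediction counts copied from Table 11 (machine translation based model).
TABLE_11 = {
    "Arabic": {"Abuse": 430, "Sexism": 1302, "Racism": 209, "Religious": 259, "Extremism": 1138},
    "German": {"Abuse": 232, "Sexism": 608, "Racism": 186, "Religious": 58, "Extremism": 98},
    "Spanish": {"Abuse": 2082, "Sexism": 2024, "Racism": 1519, "Religious": 347, "Extremism": 1310},
}

def shares(counts):
    """Convert raw counts into percentage shares of the row total."""
    total = sum(counts.values())
    return {domain: 100.0 * n / total for domain, n in counts.items()}

arabic = shares(TABLE_11["Arabic"])
german = shares(TABLE_11["German"])
spanish = shares(TABLE_11["Spanish"])
print(round(arabic["Sexism"] + arabic["Extremism"]))
print(round(german["Sexism"]), round(german["Religious"] + german["Extremism"]))
print(round(spanish["Religious"]))
```

Running the same computation over every row is a quick way to cross-check quoted percentages against the table.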
6. Conclusion
We have presented the LAHM dataset, a large-scale semi-supervised training dataset for multilingual
and multi-domain hate speech identification, which we created using a 3-layer annotation
pipeline and a combination of monolingual, multilingual and cross-lingual models. To the best
of our knowledge, LAHM is the largest of its kind, containing close to 300k tweets across 6
languages and 5 domains.
LAHM enables cross-lingual abusive language detection across five domains and in-depth study of the
interplay between language shift and domain shift. We have profiled LAHM as a comprehensive
resource for evaluating hate speech detection through a series of cross-domain experiments
in monolingual, multilingual and cross-lingual setups with state of the art transfer learning
models.
We hope that LAHM will inspire more efforts in understanding and building semi-supervised
large scale multilingual and multi-domain abusive language detection datasets.
7. Future Work
As a step towards future work, we explored leetspeak detection and identification on social networks. A lot of
hate content on social media uses leetspeak to evade moderators and automated systems. We
collected hate keywords belonging to this category for each of the 5 domains, and experimented
with a set of leet variants for each keyword to extract leetspeak hate content. We plan to use this
in future work in multilingual and multi-domain settings.
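Generating a set of leet variants for a keyword can be sketched as a Cartesian product over per-character substitutions. The substitution table below is a small made-up example, not the set used by the authors, and the `limit` parameter is an assumed safeguard against the combinatorial growth for long keywords.

```python
from itertools import product

# Hypothetical substitution table; each letter maps to itself plus leet forms.
LEET_MAP = {
    "a": ["a", "4", "@"],
    "e": ["e", "3"],
    "i": ["i", "1", "!"],
    "o": ["o", "0"],
    "s": ["s", "5", "$"],
    "t": ["t", "7"],
}

def leet_variants(keyword, limit=1000):
    """Return up to `limit` leetspeak spellings of `keyword`, original first."""
    options = [LEET_MAP.get(ch, [ch]) for ch in keyword.lower()]
    variants = []
    for combo in product(*options):
        variants.append("".join(combo))
        if len(variants) >= limit:
            break
    return variants

variants = leet_variants("hate")
print(len(variants), variants[:4])
```

The resulting strings could then serve as additional search keywords in the same collection pipeline used for the plain keywords.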
References
[1] N. Ousidhoum, Z. Lin, H. Zhang, Y. Song, D.-Y. Yeung, Multilingual and multi-aspect hate
speech analysis, in: Proceedings of EMNLP, Association for Computational Linguistics,
2019.
[2] Y.-L. Chung, E. Kuzmenko, S. S. Tekiroglu, M. Guerini, Conan–counter narratives through
nichesourcing: a multilingual dataset of responses to fight online hate speech, arXiv
preprint arXiv:1910.03270 (2019).
[3] T. Davidson, D. Warmsley, M. Macy, I. Weber, Automated hate speech detection and the
problem of offensive language, in: Proceedings of the International AAAI Conference on
Web and Social Media, volume 11, 2017.
[4] V. Kolhatkar, M. Taboada, A corpus for the analysis of online news comments (????).
[5] L. Gao, R. Huang, Detecting online hate speech using context aware models, arXiv preprint
arXiv:1710.07395 (2017).
[6] R. Kumar, A. K. Ojha, S. Malmasi, M. Zampieri, Benchmarking aggression identification
in social media, in: Proceedings of the First Workshop on Trolling, Aggression and
Cyberbullying (TRAC-2018), 2018, pp. 1–11.
[7] E. Wulczyn, N. Thain, L. Dixon, Ex machina: Personal attacks seen at scale, in: Proceedings
of the 26th international conference on world wide web, 2017, pp. 1391–1399.
[8] Z. Waseem, D. Hovy, Hateful symbols or hateful people? predictive features for hate
speech detection on twitter, in: Proceedings of the NAACL Student Research Workshop,
Association for Computational Linguistics, San Diego, California, 2016, pp. 88–93. URL:
http://www.aclweb.org/anthology/N16-2013.
[9] T. Mandl, S. Modha, C. Mandlia, D. Patel, A. Patel, M. Dave, Hasoc-hate speech and
offensive content identification in indo-european languages, 2019.
[10] A. Basile, C. Rubagotti, Crotonemilano for ami at evalita2018. a performant, cross-lingual
misogyny detection system., EVALITA Evaluation of NLP and Speech Tools for Italian 12
(2018) 206.
[11] M. Wiegand, M. Siegel, J. Ruppenhofer, Overview of the germeval 2018 shared task on the
identification of offensive language (2018).
[12] J. Salminen, M. Hopf, S. A. Chowdhury, S.-g. Jung, H. Almerekhi, B. J. Jansen, Developing
an online hate classifier for multiple social media platforms, Human-centric Computing
and Information Sciences 10 (2020) 1–34.
[13] S. A. Chowdhury, H. Mubarak, A. Abdelali, S.-g. Jung, B. J. Jansen, J. Salminen, A
multi-platform arabic news comment dataset for offensive language detection, in: Proceedings
of the 12th Language Resources and Evaluation Conference, 2020, pp. 6203–6212.
[14] M. Wiegand, J. Ruppenhofer, A. Schmidt, C. Greenberg, Inducing a lexicon of abusive
words–a feature-based approach (2018).
[15] M. Karan, J. Šnajder, Cross-domain detection of abusive language online, in: Proceedings
of the 2nd workshop on abusive language online (ALW2), 2018, pp. 132–137.
[16] E. W. Pamungkas, V. Patti, Cross-domain and cross-lingual abusive language detection:
A hybrid approach with deep learning and a multilingual lexicon, in: Proceedings of the
57th annual meeting of the association for computational linguistics: Student research
workshop, 2019, pp. 363–370.
[17] E. W. Pamungkas, V. Basile, V. Patti, Misogyny detection in twitter: a multilingual and
cross-domain study, Information Processing & Management 57 (2020) 102360.
[18] J. S. Meyer, B. Gambäck, A platform agnostic dual-strand hate speech detector, in:
Proceedings of the Third Workshop on Abusive Language Online, Association for
Computational Linguistics, 2019.
[19] K. Wang, D. Lu, S. C. Han, S. Long, J. Poon, Detect all abuse! toward universal abusive
language detection models, arXiv preprint arXiv:2010.03776 (2020).
[20] A. Arango, J. Pérez, B. Poblete, Hate speech detection is not as easy as you may think:
A closer look at model validation, in: Proceedings of the 42nd international acm sigir
conference on research and development in information retrieval, 2019, pp. 45–54.
[21] Z. Waseem, J. Thorne, J. Bingel, Bridging the gaps: Multi task learning for domain transfer
of hate speech detection, in: Online harassment, Springer, 2018, pp. 29–55.
[22] M.-A. Rizoiu, T. Wang, G. Ferraro, H. Suominen, Transfer learning for hate speech detection
in social media, arXiv preprint arXiv:1906.03829 (2019).
[23] T. Caselli, V. Basile, J. Mitrović, M. Granitzer, Hatebert: Retraining bert for abusive language
detection in english, arXiv preprint arXiv:2010.12472 (2020).
[24] A. Koufakou, E. W. Pamungkas, V. Basile, V. Patti, Hurtbert: incorporating lexical features
with bert for the detection of abusive language, in: Proceedings of the fourth workshop
on online abuse and harms, 2020, pp. 34–43.
[25] G. Glavaš, M. Karan, I. Vulic, Xhate-999: Analyzing and detecting abusive language across
domains and languages, Association for Computational Linguistics, 2020.
[26] M. Mozafari, R. Farahbakhsh, N. Crespi, Hate speech detection and racial bias mitigation
in social media based on bert model, PloS one 15 (2020) e0237861.
[27] K. B. Ozler, K. Kenski, S. Rains, Y. Shmargad, K. Coe, S. Bethard, Fine-tuning bert for
multi-domain and multi-label incivil language detection, in: Proceedings of the Fourth
Workshop on Online Abuse and Harms, 2020, pp. 28–33.
[28] S. S. Aluru, B. Mathew, P. Saha, A. Mukherjee, A deep dive into multilingual hate speech
classification, in: Machine Learning and Knowledge Discovery in Databases. Applied
Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium,
September 14–18, 2020, Proceedings, Part V, Springer International Publishing, 2021, pp.
423–439.
[29] N. Vashistha, A. Zubiaga, Online multilingual hate speech detection: experimenting with
hindi and english social media, Information 12 (2021) 5.
[30] M. O. Ibrohim, I. Budi, Translated vs non-translated method for multilingual hate speech
identification in twitter, Int. J. Adv. Sci. Eng. Inf. Technol 9 (2019) 1116–1123.
[31] M. Corazza, S. Menini, E. Cabrio, S. Tonelli, S. Villata, Hybrid emoji-based masked
language models for zero-shot abusive language detection, in: Findings of the Association
for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics,
Online, 2020, pp. 943–949. URL: https://aclanthology.org/2020.findings-emnlp.84.
doi:10.18653/v1/2020.findings-emnlp.84.
[32] H. Ahn, J. Sun, C. Y. Park, J. Seo, NLPDove at SemEval-2020 task 12: Improving offensive
language detection with cross-lingual transfer, in: Proceedings of the Fourteenth Workshop
on Semantic Evaluation, International Committee for Computational Linguistics, Barcelona
(online), 2020, pp. 1576–1586. URL: https://aclanthology.org/2020.semeval-1.206.
doi:10.18653/v1/2020.semeval-1.206.
[33] J. M. Pérez, A. Arango, F. Luque, ANDES at SemEval-2020 task 12: A jointly-trained BERT
multilingual model for offensive language detection, in: Proceedings of the Fourteenth
Workshop on Semantic Evaluation, International Committee for Computational Linguistics,
Barcelona (online), 2020, pp. 1524–1531. URL: https://aclanthology.org/2020.semeval-1.199.
doi:10.18653/v1/2020.semeval-1.199.
[34] L. Stappen, F. Brunn, B. Schuller, Cross-lingual zero-and few-shot hate speech detection
utilising frozen transformer language models and axel, arXiv preprint arXiv:2004.13850
(2020).
[35] T. Dadu, K. Pant, Team rouges at SemEval-2020 task 12: Cross-lingual inductive transfer
to detect offensive language, in: Proceedings of the Fourteenth Workshop on Semantic
Evaluation, International Committee for Computational Linguistics, Barcelona (online),
2020, pp. 2183–2189. URL: https://aclanthology.org/2020.semeval-1.290.
doi:10.18653/v1/2020.semeval-1.290.
[36] T. Ranasinghe, M. Zampieri, Multilingual offensive language identification with
cross-lingual embeddings, in: Proceedings of the 2020 Conference on Empirical Methods in
Natural Language Processing (EMNLP), Association for Computational Linguistics, Online,
2020, pp. 5838–5844. URL: https://aclanthology.org/2020.emnlp-main.470.
doi:10.18653/v1/2020.emnlp-main.470.
[37] M. Zampieri, P. Nakov, S. Rosenthal, P. Atanasova, G. Karadzhov, H. Mubarak,
L. Derczynski, Z. Pitenis, Ç. Çöltekin, Semeval-2020 task 12: Multilingual offensive language
identification in social media (offenseval 2020), arXiv preprint arXiv:2006.07235 (2020).
[38] A. M. Founta, C. Djouvas, D. Chatzakou, I. Leontiadis, J. Blackburn, G. Stringhini, A. Vakali,
M. Sirivianos, N. Kourtellis, Large scale crowdsourcing and characterization of twitter
abusive behavior, in: Twelfth International AAAI Conference on Web and Social Media,
2018.
[39] R. Gomez, J. Gibert, L. Gomez, D. Karatzas, Exploring hate speech detection in multimodal
publications, in: Proceedings of the IEEE/CVF Winter Conference on Applications of
Computer Vision, 2020, pp. 1470–1478.
[40] M. Fanton, H. Bonaldi, S. S. Tekiroğlu, M. Guerini, Human-in-the-loop for data collection:
a multi-target counter narrative dataset to fight online hate speech, in: Proceedings of the
59th Annual Meeting of the Association for Computational Linguistics, Association for
Computational Linguistics, 2021.