IIITG-ADBU at HASOC 2019: Automated Hate
   Speech and Offensive Content Detection in
      English and Code-Mixed Hindi Text

          Arup Baruah¹, Ferdous Ahmed Barbhuiya¹, and Kuntal Dey²

          ¹ Dept. of Comp. Sc. & Engg., IIIT Guwahati, India
            arup.baruah@gmail.com, ferdous@iiitg.ac.in
          ² IBM Research India, New Delhi, India, kuntadey@in.ibm.com


        Abstract. This paper presents the results obtained using Logistic
        Regression (LR), Support Vector Machine (SVM), bi-directional long
        short-term memory (BiLSTM) and Neural Network (NN) models for
        subtask A of the shared task “Hate Speech and Offensive Content
        Identification in Indo-European Languages” (HASOC). Results are
        reported for English and code-mixed Hindi. Embeddings from Language
        Models (ELMo), Glove and fastText embeddings, and TF-IDF features of
        character and word n-grams were used to train the models. Our best
        models for Hindi and English obtained macro F1 scores of 81.05 and
        74.62 respectively in the official run, and were ranked 4th and 8th
        respectively in the official ranking.

        Keywords: Hate Speech · Logistic Regression · Support Vector Machine
        · Bi-directional Long Short-Term Memory · Glove · fastText · ELMo


1     Introduction

Social media has made it easier for people to communicate with one another,
and publishing content that reaches a vast number of people has become very
easy. However, alongside the constructive dialog that takes place on social
media, content that is hateful, offensive or profane is also being published.
Such content is harmful to society. There is evidence that hateful content
published via social media has fueled communal riots in different parts of
the world.
    There has been growing interest among research communities in using
machine learning and natural language processing techniques to automatically
detect hateful and offensive content. As a step in this direction, the shared
task “Hate Speech and Offensive Content Identification in Indo-European
Languages” (HASOC) has been organized [7]. This paper presents the results
obtained by our models for subtask A of HASOC. The goal of subtask A is to
detect whether or not a given tweet is free from hateful and offensive content.

    Copyright © 2019 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0). FIRE 2019, 12-15
    December 2019, Kolkata, India.

2     Related Work
Automated detection of offensive, hateful, abusive, aggressive, and profane text
has seen the use of rule-based, traditional machine learning, and deep learning
techniques. Risch and Krestel [9] used an LR classifier to detect abusive
language, with features such as word and character n-grams, word2vec
embeddings, and word and character counts. Waseem [12] used SVM and LR
classifiers to detect racist or sexist content. Nobata et al. [8] used a
regression model to detect abusive content. Djuric et al. [3] used an LR
classifier to detect hate speech, with comment embeddings among its features.
Serra et al. [11] used a character-based RNN to detect hate speech in tweets.
Gamback and Sikdar [4] used a CNN to detect racist and sexist content.
Badjatiya et al. [1] experimented with LR, SVM, Gradient Boosted Decision
Tree (GBDT), CNN, LSTM and FastText based models. Hate speech detection in
code-mixed Hindi-English data has been studied by Mathur et al. [6], Santosh
and Aravind [10], and Kamble and Joshi [5].

3     Dataset
Each tweet in the dataset for subtask A of HASOC is labeled as either free from
hateful, offensive and profane content or not. Trial, train and test datasets
were released for the subtask. Table 1 shows the details of the datasets for
both English and Hindi. As can be seen from the table, the percentage of hate,
offensive or profane content was higher in the English trial dataset (58.81%)
than in the English train dataset (38.64%). For Hindi, the distribution of hate
and non-hate content was almost identical in the trial and train datasets. The
Hindi dataset was more balanced than the English dataset.
    The performance of the models used in this study improved when the English
trial and train datasets were combined for training. However, combining the
Hindi trial and train datasets decreased the performance of the models. Thus,
only the train dataset was used for training the Hindi models.
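The class percentages discussed above can be recomputed from the raw counts in Table 1. The following is a minimal sketch in plain Python; the function and variable names are illustrative and are not taken from the original study. It also shows the class balance of the combined English trial and train data.

```python
# Counts copied from Table 1: (not hate/offensive/profane, hate/offensive/profane).
COUNTS = {
    ("English", "Trial"): (208, 297),
    ("English", "Train"): (3591, 2261),
    ("Hindi", "Trial"): (64, 72),
    ("Hindi", "Train"): (2196, 2469),
}


def distribution(not_hof, hof):
    """Return the percentages of NOT and HOF instances, rounded to 2 places."""
    total = not_hof + hof
    return round(100 * not_hof / total, 2), round(100 * hof / total, 2)


def combine(*pairs):
    """Add up the (NOT, HOF) counts of several splits."""
    return tuple(sum(column) for column in zip(*pairs))


# English models were trained on trial + train combined.
english_combined = combine(COUNTS[("English", "Trial")],
                           COUNTS[("English", "Train")])

for key, pair in COUNTS.items():
    print(key, pair, distribution(*pair))
print("English trial + train", english_combined, distribution(*english_combined))
```

Under these counts, the combined English training pool contains 3799 non-hate and 2558 hate, offensive or profane tweets, so it is still less balanced than the Hindi train dataset.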

4     Methodology
4.1   Preprocessing
We experimented with removing URLs, hashtags, and mentions from the English
dataset. However, removing any of them degraded the performance of our models.
Thus, for our final models the dataset was used as provided, without any
preprocessing.
                  Automated Hate Speech and Offensive Content Detection

                           Table 1. Data set statistics


Language  Type   Not Hate/Offensive/Profane  Hate/Offensive/Profane  Total
English   Trial   208 (41.19%)                297 (58.81%)             505
English   Train  3591 (61.36%)               2261 (38.64%)            5852
English   Test   Not Known                   Not Known                1153
Hindi     Trial    64 (47.06%)                 72 (52.94%)             136
Hindi     Train  2196 (47.07%)               2469 (52.93%)            4665
Hindi     Test   Not Known                   Not Known                1318

4.2   Word and Sentence Embeddings

In our study, we used Embeddings from Language Models (ELMo), Glove, and
fastText embeddings. The Glove and fastText embeddings were used to train our
BiLSTM model, and ELMo was used to train a simple neural network classifier.
The 200 dimensional Glove embeddings pre-trained on Twitter data were used, and
only for the English language models. The 300 dimensional pre-trained fastText
embeddings for English and Hindi were used to train models for the respective
languages.
    For ELMo embeddings, we fine-tuned the ELMo module provided by TensorFlow
Hub. This module returns the ELMo embeddings for each word of the sentence, as
well as a vector for the complete sentence. We used the 1024 dimensional
sentence vector to train a neural network classifier.


4.3   Models

We used Logistic Regression (LR), Support Vector Machine (SVM), Bi-directional
Long Short-Term Memory (BiLSTM), and ELMo based Neural Network (NN) classifiers,
as well as an ensemble of the ELMo based NN and character-based LR classifiers.
The classifiers are described below.


Logistic Regression: The LR classifier was used for both the English and
Hindi datasets. L2 regularization was used for the classifier, and the
hyperparameter C was set to 1.2. The classifier was trained on three feature
sets: TF-IDF features of word n-grams (n = 1 to 3), of character n-grams
(n = 1 to 6), and of the combination of the two.
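To make the n-gram ranges concrete, the sketch below enumerates word n-grams for n = 1 to 3 and character n-grams for n = 1 to 6 in plain Python. It only illustrates which units would be counted before TF-IDF weighting. The whitespace tokenisation, the function names and the toy input are assumptions, not details reported in the paper.

```python
def char_ngrams(text, n_min=1, n_max=6):
    """All character n-grams of `text` with n_min <= n <= n_max."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def word_ngrams(text, n_min=1, n_max=3):
    """All word n-grams of `text` (whitespace tokenisation is an assumption)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


tweet = "this is not ok"  # toy input, not taken from the HASOC data
print(word_ngrams(tweet))
print(char_ngrams("ok"))
```

A four-word tweet yields 4 unigrams, 3 bigrams and 2 trigrams. Character n-grams also capture sub-word fragments, which may help with spelling variation in code-mixed text.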


Support Vector Machine: The SVM classifier was used for both the English and
Hindi datasets. A linear kernel was used with L2 regularization, and the
hyperparameter C was set to 1.0. The classifier was trained using the same
TF-IDF features as the LR classifier.


Bi-directional Long Short-Term Memory: The BiLSTM model used in
this study is based on the architecture from Baruah et al. [2]. The architecture
of the model is shown in Fig. 1. It consisted of a BiLSTM layer and two Dense
layers. The BiLSTM layer had 100 units and used a recurrent dropout of 0.10.
A dropout of 0.25 was applied to the output of this layer. Global max pooling
was applied to the output of the BiLSTM layer. The Dense layer that followed
had 100 units and used the ReLU activation function. A dropout of 0.25 was
also applied to the output of this layer. The final Dense layer had 1 unit and
used the sigmoid activation function. The Adam optimizer and the binary
cross-entropy loss function were used for training.


                             Fig. 1. BiLSTM model


   The model has been trained using 200 dimensional Glove embeddings, 300
dimensional English fastText embeddings, and 300 dimensional Hindi fastText
embeddings.
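As a rough guide to the size of this architecture, the number of weights above the embedding layer can be worked out from the layer sizes given. The sketch below assumes a standard LSTM cell (four gates, each with input weights, recurrent weights and a bias), 100 units per direction with the two directions concatenated, and it excludes the embedding layer itself. None of these details are stated in the paper, so the figures are indicative only.

```python
def lstm_params(input_dim, units):
    """Weights of one LSTM direction: 4 gates with input, recurrent and bias terms."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weights of a fully connected layer, including biases."""
    return input_dim * units + units


def bilstm_classifier_params(embedding_dim, lstm_units=100, dense_units=100):
    bilstm = 2 * lstm_params(embedding_dim, lstm_units)  # forward + backward
    pooled_dim = 2 * lstm_units  # max pooling over time keeps the feature size
    hidden = dense_params(pooled_dim, dense_units)
    output = dense_params(dense_units, 1)
    return bilstm + hidden + output


for name, dim in [("Glove (200d)", 200), ("fastText (300d)", 300)]:
    print(name, bilstm_classifier_params(dim))
```

Dropout and global max pooling add no trainable weights, which is why they do not appear in the count.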


ELMo based Neural Network: The architecture of the ELMo based neural
network is shown in Fig. 1. It consisted of an ELMo embedding layer and two
Dense layers. The first Dense layer had 256 units and used the ReLU activation
function. The next Dense layer had 1 unit and used the sigmoid activation
function. The 1024 dimensional tweet vector obtained from the ELMo embedding
layer was used to train the network.
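The classification head described above (1024 inputs, 256 ReLU units, 1 sigmoid unit) can be sketched as a forward pass in plain Python. This is only an illustration of the layer shapes and activations: the weights are random, the input is a random stand-in for an ELMo sentence vector, and no fine-tuning or training is shown.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def init_layer(rng, in_dim, out_dim):
    """Small random weights and zero biases (an arbitrary choice for this sketch)."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def predict(sentence_vector, hidden_layer, output_layer):
    """Map a 1024-d sentence vector to a probability in (0, 1)."""
    hidden = relu(dense(sentence_vector, *hidden_layer))
    return sigmoid(dense(hidden, *output_layer)[0])


rng = random.Random(0)
hidden_layer = init_layer(rng, 1024, 256)
output_layer = init_layer(rng, 256, 1)
fake_vector = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # stand-in for ELMo output
probability = predict(fake_vector, hidden_layer, output_layer)
print(probability)
```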

Ensemble: The architecture of the Ensemble model used is shown in Fig. 1. It
is an ensemble of the ELMo based NN classifier and the character n-gram based
LR classifier. The predictions from the two classifiers were averaged to obtain
the final prediction.
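The averaging step can be sketched as follows. The paper does not state whether class probabilities or labels were averaged, nor which decision threshold was applied. The sketch assumes that each classifier outputs the probability of the hate/offensive class and that a threshold of 0.5 is used. The probabilities shown are made-up values.

```python
def ensemble_average(prob_a, prob_b, threshold=0.5):
    """Average two lists of HOF probabilities and threshold the result."""
    if len(prob_a) != len(prob_b):
        raise ValueError("both classifiers must score the same tweets")
    averaged = [(a + b) / 2 for a, b in zip(prob_a, prob_b)]
    labels = ["HOF" if p >= threshold else "NOT" for p in averaged]
    return averaged, labels


# Hypothetical probabilities for three tweets.
elmo_nn_probs = [0.75, 0.25, 0.5]
char_lr_probs = [0.5, 0.25, 0.25]

averaged, labels = ensemble_average(elmo_nn_probs, char_lr_probs)
print(averaged, labels)
```

Averaging lets a classifier that leans towards one class be balanced by one that leans towards the other.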

5    Results
As mentioned in section 3, the models for English were trained on the
combined trial and train datasets, while the models for Hindi were trained
on the train dataset only. For validation, a stratified split of the dataset was
performed: 20% of the dataset was reserved as the validation dataset and the
remaining 80% was used for training the models. Table 2 and Table 3 present
the results obtained by our models on the English and Hindi validation datasets
respectively.
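A stratified 80/20 split keeps the class proportions the same in the training and validation parts. The sketch below is one possible implementation in plain Python. The per-class rounding rule and the random seed are assumptions, and the label lists are rebuilt from the counts in Table 1 rather than from the actual tweets.

```python
import random
from collections import defaultdict


def stratified_split(labels, validation_fraction=0.2, seed=0):
    """Return (train_indices, validation_indices) preserving label proportions."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    rng = random.Random(seed)
    train, validation = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * validation_fraction)
        validation.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return train, validation


def class_counts(labels, indices):
    return {c: sum(labels[i] == c for i in indices) for c in ("NOT", "HOF")}


# English: trial + train combined; Hindi: train only (counts from Table 1).
english_labels = ["NOT"] * (208 + 3591) + ["HOF"] * (297 + 2261)
hindi_labels = ["NOT"] * 2196 + ["HOF"] * 2469

en_train, en_val = stratified_split(english_labels)
hi_train, hi_val = stratified_split(hindi_labels)
print(class_counts(english_labels, en_val), class_counts(hindi_labels, hi_val))
```

With these assumptions the validation sets contain 760 NOT and 512 HOF tweets for English, and 439 NOT and 494 HOF tweets for Hindi, which agrees with the row totals of the confusion matrices in Table 4 and Table 5.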

                    Table 2. Results of our models for English

    Approach    Features                  Acc       Prec     Rec       F1
    LR          Char n-grams (1 to 6)     62.50     62.08    62.53     61.95
    LR          Word n-grams (1 to 3)     63.60     62.63    62.94     62.69
    LR          Char & Word n-grams       64.07     63.17    63.53     63.23
    SVM         Char n-grams (1 to 6)     65.96     64.37    62.46     62.57
    SVM         Word n-grams (1 to 3)     66.19     64.73    62.47     62.53
    SVM         Char & Word n-grams       64.86     63.10    62.34     62.50
    BiLSTM      pre-trained fastText      67.69     66.87    63.56     63.59
    BiLSTM      pre-trained Glove         64.31     63.12    63.31     63.19
    NN          fine-tuned ELMo           65.80     64.99    60.83     60.26
    Ensemble    ELMo & Char n-grams       65.49     63.89    63.47     63.61

    As can be seen from Table 2, for English the BiLSTM model trained on
pre-trained fastText embeddings obtained the best accuracy, precision and
recall, with a macro F1 score of 63.59. The ensemble of the ELMo based NN and
the character n-gram based LR model obtained a marginally higher F1 score of
63.61. By itself, the ELMo based NN classifier performed the worst among all
the models with an F1 score of 60.26. However, it had the second-best precision
score of 64.99. Among the LR models, the one trained using both character and
word n-grams performed the best with an F1 score of 63.23. The performance
of the three SVM models was almost identical.
    From Table 3, it can be seen that for Hindi, the SVM model trained on
character n-grams performed the best on all the metrics considered. The model
obtained an F1 score of 82.73. Word n-gram based models (both LR and SVM)
performed worse than the character n-gram based models on the Hindi dataset,
with F1 scores of about 77. The BiLSTM model trained using fastText Hindi
embeddings performed the worst with an F1 score of only 54.15. A possible
reason for this poor performance is that the dataset was code-mixed and also
contained English words, whereas the embeddings used were for Hindi only.
   Table 4 shows the confusion matrix for the LR and SVM models for English.
Among the LR models, the word n-gram based models were better at predicting
the non-hate category, while the character n-gram based model was better at
predicting the hate category. Among the SVM models, the character and word
n-gram based models performed about equally well on both categories. Compared
to the LR models, the SVM models were better at predicting the non-hate
category, while the LR models were better at predicting the hate category.
   For Hindi, as can be seen from Table 5, both character-based LR and SVM
models performed equally well in predicting the non-hate category. The character-
based SVM models were slightly better in predicting the hate category. Both
word-based LR and SVM models performed poorly in predicting the non-hate
category.
   From Table 6, it can be seen that the ELMo based NN model was the best in
predicting the non-hate category among all the models. However, it was poor in
predicting the hate category. For this reason, it was paired with the character-
based LR model in our ensemble model. The fastText based BiLSTM model was
the second best in predicting the non-hate group. Compared to the ELMo based
NN model, its performance in predicting the hate category was much better.
    Based on these results obtained on the validation dataset, we selected the
following models for submission: fastText based BiLSTM (English Run 1), our
ensemble model (English Run 2), character and word n-gram based LR (English
Run 3), character n-gram based SVM (Hindi Run 1), character n-gram based
LR (Hindi Run 2), and character and word n-gram based SVM (Hindi Run 3).
    The official results for our models are listed in Table 7 and Table 8. As we
made an error in submitting the results for run 3 of the English language, the
results for this run are missing. As can be seen from the tables, for English
our best performing model on the test dataset was the fastText based BiLSTM
model. It obtained a macro F1 score of 74.62 and the 8th position among 79
submissions for English. For Hindi, our best performing models were the
character-based LR and SVM models with F1 scores of 81.05 and 80.98
respectively. These two models obtained the 4th and 5th positions respectively
among the 37 submissions made for Hindi. Table 9 shows the confusion matrix of
our models for the official run.


                     Table 3. Results of our models for Hindi
   Approach     Features                  Acc      Prec         Rec     F1
   LR           Char n-grams (1 to 6)     81.67    81.85        81.91   81.67
   LR           Word n-grams (1 to 3)     77.49    77.57        77.65   77.48
   LR           Char & Word n-grams       81.14    81.18        81.29   81.13
   SVM          Char n-grams (1 to 6)     82.74    82.77        82.88   82.73
   SVM          Word n-grams (1 to 3)     77.38    77.33        77.40   77.34
   SVM          Char & Word n-grams       81.56    81.54        81.65   81.54
   BiLSTM       pre-trained fastText      63.13    62.22        56.65   54.15

      Table 4. Confusion Matrix of LR and SVM models for the English Dataset

             LR            LR            LR           SVM           SVM           SVM
            Char          Word      Char & Word       Char          Word      Char & Word
          n-grams       n-grams       n-grams       n-grams       n-grams       n-grams
           NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF
     NOT   474   286     504   256     504   256     611   149     620   140     572   188
     HOF   191   321     207   305     201   311     284   228     290   222     259   253
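The scores in Table 2 can be recomputed from the confusion matrices in Table 4. The sketch below assumes that rows are the actual classes and columns the predicted classes, and that precision, recall and F1 are macro-averaged over the two classes. Under these assumptions it reproduces the Table 2 row for the character n-gram based LR model. The function name is illustrative.

```python
def macro_scores(matrix):
    """matrix[i][j] = number of instances of actual class i predicted as class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[k][k] for k in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        true_positive = matrix[k][k]
        predicted = sum(matrix[i][k] for i in range(n))  # column total
        actual = sum(matrix[k])                          # row total
        p = true_positive / predicted if predicted else 0.0
        r = true_positive / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)

    def mean(values):
        return sum(values) / len(values)

    return {
        "acc": round(100 * accuracy, 2),
        "prec": round(100 * mean(precisions), 2),
        "rec": round(100 * mean(recalls), 2),
        "f1": round(100 * mean(f1s), 2),
    }


# LR with character n-grams, English validation set (Table 4); order: NOT, HOF.
lr_char = [[474, 286], [191, 321]]
scores = macro_scores(lr_char)
print(scores)
```

For this matrix the function returns an accuracy of 62.5, precision of 62.08, recall of 62.53 and F1 of 61.95, matching the first row of Table 2.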


       Table 5. Confusion Matrix of LR and SVM models for the Hindi Dataset
             LR            LR            LR           SVM           SVM           SVM
            Char          Word      Char & Word       Char          Word      Char & Word
          n-grams       n-grams       n-grams       n-grams       n-grams       n-grams
           NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF
     NOT   377    62     353    86     368    71     374    65     341    98     365    74
     HOF   109   385     124   370     105   389      96   398     113   381      98   396


Table 6. Confusion Matrix of BiLSTM, ELMo based NN and Ensemble for English,
and BiLSTM for Hindi
           BiLSTM        BiLSTM      ELMo based     Ensemble       BiLSTM
          English       English        Neural      ELMo NN &       Hindi
           Glove        fastText      Network       Char LR       fastText
           NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF     NOT   HOF
     NOT   520   240     644   116     656   104     561   199     683    77
     HOF   214   298     295   217     331   181     240   272     392   120


                   Table 7. Official results for English Subtask-A

 Run  Model                         Accuracy  Precision  Recall  Macro F1  Weighted F1  Position
 1    BiLSTM (fastText)             80.00     74.00      76.00   74.62     80.64        8th
 2    Ensemble (ELMo NN + Char LR)  77.00     72.00      77.00   73.21     78.43        15th
      Best System                   -         -          -       78.82     83.95        1st


                    Table 8. Official results for Hindi Subtask-A
 Run  Model            Accuracy  Precision  Recall  Macro F1  Weighted F1  Position
 1    SVM (char)       81.00     81.00      81.00   80.98     81.06        5th
 2    LR (char)        81.00     81.00      81.00   81.05     81.13        4th
 3    SVM (word+char)  80.00     80.00      80.00   79.85     79.93        14th
      Best System      -         -          -       81.49     82.02        1st

                 Table 9. Confusion Matrix from the official results
        English Run 1   English Run 2    Hindi Run 1     Hindi Run 2     Hindi Run 3
        fastText based   Ensemble of     Char n-gram     Char n-gram    Char and Word
            BiLSTM        ELMo NN &         based           based        n-gram based
                           Char LR           SVM              LR             SVM
            HOF    NOT      HOF    NOT      HOF    NOT      HOF    NOT      HOF    NOT
     HOF    190     98      221     67      499    106      497    108      496    109
     NOT    129    736      195    670      144    569      141    572      156    557
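The macro and weighted F1 scores in Table 7 and Table 8 can be recovered from the confusion matrices in Table 9. The sketch below assumes that rows are the actual classes and columns the predicted classes, in the order HOF, NOT. Macro F1 is the plain mean of the per-class F1 scores, while weighted F1 weights each class by its number of actual instances. The function name is illustrative.

```python
def f1_summary(matrix):
    """Return (macro F1, weighted F1) in percent for a square confusion matrix."""
    n = len(matrix)
    support = [sum(row) for row in matrix]  # actual instances per class
    f1_per_class = []
    for k in range(n):
        true_positive = matrix[k][k]
        predicted = sum(matrix[i][k] for i in range(n))
        denominator = predicted + support[k]
        # F1 = 2TP / (2TP + FP + FN) = 2TP / (predicted + actual)
        f1_per_class.append(2 * true_positive / denominator if denominator else 0.0)
    macro = sum(f1_per_class) / n
    weighted = sum(s * f for s, f in zip(support, f1_per_class)) / sum(support)
    return round(100 * macro, 2), round(100 * weighted, 2)


# Confusion matrices from Table 9; rows and columns ordered HOF, NOT.
english_run_1 = [[190, 98], [129, 736]]
hindi_run_2 = [[497, 108], [141, 572]]

print(f1_summary(english_run_1))
print(f1_summary(hindi_run_2))
```

Under these assumptions the function returns (74.62, 80.64) for English Run 1 and (81.05, 81.13) for Hindi Run 2, in agreement with the official tables.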

6    Conclusion
Hate speech and offensive content in social media is potentially dangerous for
society. As part of the shared task HASOC, this study used LR, SVM, BiLSTM
and NN models for automated detection of hate speech and offensive content.
Features such as word and character n-grams and Glove, fastText and ELMo
embeddings were used in the study. Our best models obtained F1 scores of 74.62
and 81.05 on the English and Hindi datasets respectively. In our study, we did
not use features such as dependency relations or part-of-speech tags. Further
experiments can be performed to check whether these features improve the
performance of the classifiers.


References
 1. Badjatiya, P., Gupta, S., Gupta, M., Varma, V.: Deep Learning for Hate Speech
    Detection in Tweets. In: WWW 2017. pp. 759–760. Perth (2017)
 2. Baruah, A., Barbhuiya, F., Dey, K.: Bi-directional LSTM for Hate Speech
    Detection. In: 13th International Workshop on Semantic Evaluation.
    pp. 317–376. Minneapolis (2019)
 3. Djuric, N., Zhou, J., Morris, R., Grbovic, M., Radosavljevic, V., Bhamidipati, N.:
    Hate Speech Detection with Comment Embeddings. In: WWW 2015. pp. 29–30.
    Florence, Italy (2015)
 4. Gamback, B., Sikdar, U.: Using Convolutional Neural Networks to Classify Hate-
    Speech. In: ALW1 at ACL 2017. pp. 85–90. Vancouver (2017)
 5. Kamble, S., Joshi, A.: Hate Speech Detection from Code-mixed Hindi-English
    Tweets Using Deep Learning Models. In: 15th International Conference on Natural
    Language Processing. pp. 155–160. Punjab, India (2018)
 6. Mathur, P., Shah, R., Sawhney, R., Mahata, D.: Detecting offensive tweets in
    Hindi-English code-switched language. In: Sixth International Workshop on
    Natural Language Processing for Social Media. pp. 18–26. Melbourne (2018)
 7. Modha, S., Mandl, T., Majumder, P., Patel, D.: Overview of the HASOC track at
    FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European
    Languages. In: Proceedings of the 11th annual meeting of the Forum for
    Information Retrieval Evaluation (2019)
 8. Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., Chang, Y.: Abusive Language
    Detection in Online User Content. In: WWW 2016. pp. 145–153. Montreal (2016)
 9. Risch, J., Krestel, R.: Delete or not Delete? Semi-Automatic Comment Moderation
    for the Newsroom. In: TRAC-1 at COLING 2018. pp. 166–176. Santa Fe, USA
    (2018)
10. Santosh, T., Aravind, K.: Hate Speech Detection in Hindi-English Code-Mixed
    Social Media Text. In: ACM India Joint International Conference on Data Science
    and Management of Data. pp. 310–313. Kolkata, India (2019)
11. Serra, J., Leontiadis, I., Spathis, D., Stringhini, G., Blackburn, J., Vakali, A.: Class-
    based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words. In:
    ALW1 at ACL 2017. pp. 36–40. Vancouver (2017)
12. Waseem, Z.: Are You a Racist or Am I Seeing Things? Annotator Influence on
    Hate Speech Detection on Twitter. In: NLP+CSS at EMNLP 2016. pp. 138–142.
    Austin, USA (2016)