<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detection of Hate or Offensive Phrase using Magnified Tf-Idf</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Palash Nandi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dipankar Das</string-name>
          <email>dipankar.das@jadavpuruniversity.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Jadavpur University</institution>
          ,
          <addr-line>Kolkata</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>A pressing challenge that social media platforms face today is the abundant presence of hate speech in text messages. Automatic hate speech detection has therefore become an important ethical concern, and research is needed to address it. In the present paper, we propose a tf-idf based binary classification framework that manipulates scores obtained as the differences between hate and offensive (HOF) words and non-HOF (NOT) words. Employing this framework, we achieved Macro F1 scores of 0.6813 and 0.6762 on the English and Hindi test datasets, respectively, provided in subtask-1A of the HASOC 2021 [13] shared task.</p>
      </abstract>
      <kwd-group>
        <kwd>hate speech</kwd>
        <kwd>tf-idf</kwd>
        <kwd>HOF</kwd>
        <kwd>NOT</kwd>
        <kwd>MagTIDS</kwd>
        <kwd>NonMagTIDS</kwd>
        <kwd>magnification factor</kwd>
        <kwd>knowledgebase</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In this paper, we present a HOF detection framework for subtask-1A of HASOC
2021, based on tf-idf and a manually created knowledge base of hate words for English and Hindi.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>Research on HOF detection has evolved from keyword-based approaches [3,4] through distributional
semantics based classifiers [5,6,7] to deep learning based classifiers [8,9,10]. Sood et al. [3] used a list of
profane words, identifying 40% of the words that are profane and then correctly identifying
52% as HOF or NOT. Mondal et al. [4] used sentence structures and Hatebase (a structured repository of
regionalized, multilingual hate speech, https://hatebase.org/) to identify hate
targets. Nobata et al. [5] detected hate speech, profanity, and derogatory language in social media
using n-grams as well as linguistic, syntactic, and distributional semantic features. Djuric et al. [6] detected
online hate using word embeddings from a neural network called Paragraph2vec and compared them with the
Bag of Words (BOW) model. Saleem et al. [7] used Labeled Latent Dirichlet Allocation (LLDA) to
automatically infer topics for the classifier. Park et al. [8] detected racist and sexist language through
a two-step approach with convolutional neural networks. They used three CNN models (CharCNN,
WordCNN, and HybridCNN) on 20K tweets, achieving the best performance with HybridCNN and
the worst with CharCNN. Zhang et al. [9] used a pre-trained word embedding layer to map the text
into vector space, which was then passed through a convolution layer with max-pooling
downsampling. Badjatiya et al. [10] classified the hatefulness of tweets using deep neural
networks.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Task Description</title>
      <p>Subtask-1A of HASOC 2021 focuses strictly on the binary classification of the HOF and NOT classes.
The definitions of the HOF and NOT classes are given below.</p>
      <p>NOT: A NOT statement does not contain any hate speech or profane or offensive content.</p>
      <p>HOF: A HOF statement contains hateful, offensive, or profane content.</p>
      <p>Given a Twitter post, subtask-1A expects the participating system to identify whether it is HOF or
NOT. For example, the Twitter post ‘@TheRealOJ32 Of all the retired NFL players, why is it that you
DON’T suffer from CTE? You should be at the bottom of a pool you mistook for an elevator.
#murderer’ is expected to be identified as HOF as a person or a group of people is targeted with
hateful, offensive statements whereas the Twitter post ‘Empty podiums make too much noise 
#ToryLeadershipDebate #UKPM #BorisJohnsonShouldNotBePM #Leadersdebate #GTTO
#JC4PM2019 #frightnight https://t.co/aDgCqhdDTl’ should be labeled as NOT. The data for
subtask-1A of HASOC 2021 are available for English, Hindi, and Marathi. We use the English and Hindi
datasets in this work.</p>
    </sec>
    <sec id="sec-5">
      <title>3.1. Data Analysis</title>
      <p>In this section, we describe the datasets used to create the knowledge base, HOF_knowledge_base,
for both English and Hindi. We used the individual datasets as well as the merged dataset from HASOC
2019, HASOC 2020 [12], and HASOC 2021 [11]. Table 1 summarizes the datasets for
both languages.</p>
      <p>Table 1: Datasets used for English (HASOC_EN_2019, HASOC_EN_2020, HASOC_EN_2021, HASOC_EN_COMBINED) and Hindi (HASOC_HI_2019, HASOC_HI_2020, HASOC_HI_2021, HASOC_HI_COMBINED).</p>
    </sec>
    <sec id="sec-6">
      <title>4. Proposed Approach</title>
    </sec>
    <sec id="sec-7">
      <title>4.1. Preprocessing</title>
      <p>Since Twitter posts mostly do not follow grammatical conventions, raw Twitter
posts are not suitable for direct use in classification. We therefore opted for a preprocessing pipeline to
refine the Twitter data. The steps in the preprocessing pipeline are explained below with the help of an
English and a Hindi Twitter post.</p>
      <p>Convert words to lower case: HOF words are insensitive to letter case. For that reason,
each word of each sentence is converted to lower case for the English dataset, whereas for the
Hindi dataset this applies only to user mentions. For example, the English
Twitter post ‘@realDonaldTrump Technically that's still turning back the clock, you
FatHead 💩 https://t.co/jbKaPJmpt1’ is turned into ‘@realdonaldtrump technically that's
still turning back the clock, you fathead 💩 https://t.co/jbkapjmpt1’ and the Hindi Twitter post
‘@AskAnshul, 💩 आसमानी किताब के नाजायज औलाद है।’ is turned into ‘@askanshul, 💩
आसमानी किताब के नाजायज औलाद है।’</p>
      <p>Replace consecutive spaces with a single space: Twitter posts often contain multiple
consecutive spaces. These consecutive spaces are identified and replaced by a single space.
For example, the English Twitter post ‘@realdonaldtrump technically    that's still turning
back the clock, you fathead 💩 https://t.co/jbkapjmpt1’ is turned into ‘@realdonaldtrump
technically that's still turning back the clock, you fathead 💩 https://t.co/jbkapjmpt1’ and the
Hindi Twitter post ‘@askanshul,     💩 आसमानी किताब के नाजायज औलाद है।’ is turned into
‘@askanshul, 💩 आसमानी किताब के नाजायज औलाद है।’</p>
      <p>Remove user mentions (by @): As our proposed methodology is sensitive to HOF
phrases only, the presence of user mentions does not help the system. For that
reason, all user mentions are removed. For example, the English Twitter post
‘@realdonaldtrump technically that's still turning back the clock, you fathead 💩
https://t.co/jbkapjmpt1’ is turned into ‘technically that's still turning back the clock, you
fathead 💩 https://t.co/jbkapjmpt1’ and the Hindi Twitter post ‘@askanshul, 💩 आसमानी
किताब के नाजायज औलाद है।’ is turned into ‘, 💩 आसमानी किताब के नाजायज औलाद है।’</p>
      <p>Replace emojis with corresponding text: Emojis, when used directly, are not useful in HOF
detection, as an individual emoji does not express any hate or offense. When combined
with context, however, emojis can be expressive. For instance, ‘💩’ is neither hateful nor offensive
content, but ‘you piece of 💩’ is considered a derogatory comment. For that reason, all emojis
present in the sentences are replaced with the corresponding text. For example, the English
Twitter post ‘technically that's still turning back the clock, you fathead 💩
https://t.co/jbkapjmpt1’ is turned into ‘technically that's still turning back the clock, you
fathead pile of poo https://t.co/jbkapjmpt1’ and the Hindi Twitter post ‘, 💩 आसमानी किताब
के नाजायज औलाद है।’ is turned into ‘, मल का ढेर आसमानी किताब के नाजायज औलाद है।’</p>
      <p>Remove URLs: Twitter posts often contain a link to supporting images or videos, but as the
proposed methodology considers only the Twitter text for analysis, any link present is discarded.
For example, the English Twitter post ‘technically that's still turning back the clock, you
fathead pile of poo https://t.co/jbkapjmpt1’ is turned into ‘technically that's still turning back
the clock, you fathead pile of poo’ and the Hindi Twitter post ‘, मल का ढेर आसमानी किताब के
नाजायज औलाद है।’ remains unaltered as it does not contain any link.</p>
      <p>Expand contracted words: Contracted words are replaced with the equivalent phrase for
better understanding. For example, the English Twitter post ‘technically that's still turning
back the clock, you fathead pile of poo’ is turned into ‘technically that is still turning back
the clock, you fathead pile of poo’ and the Hindi Twitter post ‘, मल का ढेर आसमानी किताब के
नाजायज औलाद है।’ remains unaltered as it does not contain any contracted word.</p>
      <p>Remove punctuation marks: Any punctuation marks present in the sentence are removed, as
punctuation marks are neither part of the hatebase nor contribute to the detection of HOF
phrases. For example, the English Twitter post ‘technically that is still turning back the
clock, you fathead pile of poo’ is turned into ‘technically that is still turning back the clock
you fathead pile of poo’ and the Hindi Twitter post ‘, मल का ढेर आसमानी किताब के नाजायज
औलाद है।’ is turned into ‘मल का ढेर आसमानी किताब के नाजायज औलाद है’.</p>
      <p>Remove stop words: Any stop word present in the sentence is detected<sup>3</sup> and removed, as stop
words are neither part of the hatebase nor contribute to the detection of HOF phrases. For
example, the English Twitter post ‘technically that is still turning back the clock you fathead
pile of poo’ is turned into ‘technically still turning back clock fathead pile poo’ and the Hindi
Twitter post ‘मल का ढेर आसमानी किताब के नाजायज औलाद है’ is turned into ‘मल ढेर आसमानी
किताब नाजायज औलाद’.</p>
      <p>Each word in the semi-processed sentence is lemmatized for the English dataset. For the Hindi
dataset, lemmatization is not used due to the unavailability of a suitable lemmatizer.</p>
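      <p>The preprocessing steps above can be sketched roughly as follows. This is only an illustrative sketch and not the implementation used in our experiments: the lookup tables EMOJI_TEXT, CONTRACTIONS, and STOP_WORDS are tiny hypothetical stand-ins for the real resources, lemmatization is omitted, and string.punctuation covers ASCII punctuation only, so the Hindi danda would need separate handling.</p>
      <preformat><![CDATA[
```python
import re
import string

# Hypothetical stand-in resources (assumptions, not the paper's actual tables).
EMOJI_TEXT = {"\U0001F4A9": "pile of poo"}
CONTRACTIONS = {"that's": "that is", "don't": "do not"}
STOP_WORDS = {"that", "is", "the", "you", "of"}


def preprocess(post: str) -> str:
    """Apply the eight preprocessing steps in the order described in the text."""
    text = post.lower()                                  # 1. lower case
    text = re.sub(r"\s+", " ", text)                     # 2. collapse consecutive spaces
    text = re.sub(r"@\w+", "", text)                     # 3. remove user mentions
    for emoji_char, name in EMOJI_TEXT.items():          # 4. emoji -> text
        text = text.replace(emoji_char, f" {name} ")
    text = re.sub(r"https?://\S+", "", text)             # 5. remove URLs
    for short, full in CONTRACTIONS.items():             # 6. expand contractions
        text = text.replace(short, full)
    # 7. remove (ASCII) punctuation marks
    text = text.translate(str.maketrans("", "", string.punctuation))
    # 8. remove stop words; split() also discards leftover extra spaces
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    raw = ("@realDonaldTrump Technically   that's still turning back "
           "the clock, you FatHead \U0001F4A9 https://t.co/jbKaPJmpt1")
    print(preprocess(raw))
```
]]></preformat>
      <p>With these stand-in tables, the running English example reduces to the same string as in the stop-word step above.</p>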
    </sec>
    <sec id="sec-8">
      <title>4.2. Creation of MagTIDS</title>
      <p><sup>3</sup> Stop words for English are collected using the NLTK toolkit available at https://www.nltk.org/, and stop words for Hindi are collected from
the open-source repository available at https://github.com/Alir3z4/stop-words/blob/master/hindi.txt.</p>
      <p>MagTIDS contains the magnified tf-idf difference scores between the HOF and NOT classes for each
word. To generate the MagTIDS score of a word word_i, we first calculate the tf-idf score
of word_i with respect to the HOF and NOT classes from the training dataset. The magnified difference score is then
obtained by multiplying a selected magnification factor by the absolute difference between the tf-idf
scores of word_i with respect to the HOF and NOT classes. The formula for the MagTIDS score of
word_i is given below.</p>
      <p><disp-formula>MagTIDS(word_i) = magnification_factor * | tf_idf[word_i][HOF] - tf_idf[word_i][NOT] |</disp-formula></p>
      <p>where tf_idf[word_i][HOF] is the tf-idf score of word_i with respect to class HOF and
tf_idf[word_i][NOT] is the tf-idf score of word_i with respect to class NOT.</p>
    </sec>
    <sec id="sec-9">
      <title>4.3. Creation of NonMagTIDS</title>
      <p>NonMagTIDS contains the non-magnified differences of tf-idf scores between the HOF and NOT classes
for each word. To generate the NonMagTIDS score of a word word_i, the absolute difference between
the tf-idf scores of word_i with respect to the HOF and NOT classes is taken. The formula for the
NonMagTIDS score of word_i is given below.</p>
      <p><disp-formula>NonMagTIDS(word_i) = | tf_idf[word_i][HOF] - tf_idf[word_i][NOT] |</disp-formula></p>
      <p>where tf_idf[word_i][HOF] is the tf-idf score of word_i with respect to class HOF and
tf_idf[word_i][NOT] is the tf-idf score of word_i with respect to class NOT.</p>
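      <p>As a small illustration, the two scores can be computed from a per-class tf-idf table as sketched below. The nested dictionary layout, the function names, and the tf-idf values are assumptions made for this example only; treating a missing entry as a tf-idf score of zero is likewise an assumption.</p>
      <preformat><![CDATA[
```python
def non_mag_tids(tf_idf: dict, word: str) -> float:
    """NonMagTIDS: absolute tf-idf difference between the HOF and NOT classes."""
    scores = tf_idf.get(word, {})
    # Assumption: a class with no entry for the word contributes a score of 0.
    return abs(scores.get("HOF", 0.0) - scores.get("NOT", 0.0))


def mag_tids(tf_idf: dict, word: str, magnification_factor: float) -> float:
    """MagTIDS: the same difference scaled by the magnification factor."""
    return magnification_factor * non_mag_tids(tf_idf, word)


if __name__ == "__main__":
    # Made-up tf-idf values, for illustration only.
    tf_idf = {
        "fathead": {"HOF": 0.5, "NOT": 0.25},
        "clock": {"HOF": 0.125, "NOT": 0.25},
    }
    print(non_mag_tids(tf_idf, "fathead"))   # 0.25
    print(mag_tids(tf_idf, "fathead", 100))  # 25.0
```
]]></preformat>
      <p>With a magnification factor of one, the two scores coincide.</p>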
    </sec>
    <sec id="sec-10">
      <title>4.4. MagTIDS based Binary Classification</title>
      <p>Detecting only the important words of the respective classes using the tf-idf score is not sufficient for
the classification task, as many non-offensive words like 'people', 'india', 'significant', 'like', 'trade',
and 'face' hold top scores in the HOF class. To make the classification model sensitive to HOF words, we
created two parsing modules. Each of them returns a cumulative score after parsing the preprocessed
input string. First, the module parse_HOF uses HOF_knowledge_base and MagTIDS to calculate the
cumulative score when any offensive, hate, or profane word is encountered. Second, the module
parse_NOT uses only NonMagTIDS scores to accumulate the score over consecutive words.
After parsing with both the parse_HOF and parse_NOT modules, a normalized distribution of
cumulative scores is obtained. Finally, the class corresponding to the module with the highest score is
taken as the output.</p>
    </sec>
    <sec id="sec-11">
      <title>4.4.1. Parsing with module parse_HOF for HOF phrases</title>
      <p>Module parse_HOF is sensitive to HOF words. It takes a preprocessed string as input and
returns a score based on the presence of HOF words. To identify the HOF words, parse_HOF mainly
uses HOF_knowledge_base and the MagTIDS scores. For an input sentence, parse_HOF initially sets the
cumulative score to 0 and iterates over each word of the received preprocessed text. It checks whether
any HOF keyword is present in the current text token. To reduce the detection of false-positive HOF
words, a text token ‘word_i’ is considered HOF if and only if at least one recognized HOF keyword
kw_i from HOF_knowledge_base is a substring of word_i and the absolute difference between the
lengths of word_i and kw_i is not more than two. For example, the token ‘banged’ is considered HOF
text with respect to the keyword ‘bang’, but ‘bangalore’ is not. If the token is HOF, the MagTIDS score
corresponding to the matched keyword kw_i is added to the cumulative score; otherwise, the NonMagTIDS
score of word_i is added. Figure 1 shows the algorithm of the parse_HOF module.</p>
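      <p>The matching rule and the score accumulation described above might be sketched as follows. This is a minimal sketch rather than a reproduction of Figure 1: the helper name match_keyword, the dictionary-based score tables, and the choice to take the first matching keyword when several match are assumptions, and tokens without a score are assumed to contribute zero.</p>
      <preformat><![CDATA[
```python
def match_keyword(word, knowledge_base):
    """Return a HOF keyword that is a substring of `word` and differs from it
    in length by at most two characters, or None if there is no such keyword."""
    for kw in knowledge_base:
        if kw in word and abs(len(word) - len(kw)) <= 2:
            return kw  # assumption: the first matching keyword is used
    return None


def parse_hof(text, knowledge_base, mag_tids, non_mag_tids):
    """Cumulative score of a preprocessed string, sensitive to HOF words."""
    score = 0.0
    for word in text.split():
        kw = match_keyword(word, knowledge_base)
        if kw is not None:
            score += mag_tids.get(kw, 0.0)         # HOF token: magnified score
        else:
            score += non_mag_tids.get(word, 0.0)   # other token: plain score
    return score


if __name__ == "__main__":
    kb = ["bang"]  # toy knowledge base
    print(match_keyword("banged", kb))     # bang
    print(match_keyword("bangalore", kb))  # None
```
]]></preformat>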
    </sec>
    <sec id="sec-12">
      <title>4.4.2. Parsing with module parse_NOT</title>
      <p>Module parse_NOT takes a preprocessed string as input and returns a score based on the
NonMagTIDS scores. It is not specifically sensitive to either HOF or NOT words. Initially,
parse_NOT sets the cumulative score to 0. It then iterates over each word of the received preprocessed
text and increases the cumulative score by the NonMagTIDS score of each word. Figure 2 shows the
algorithm of the parse_NOT module.</p>
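      <p>A minimal sketch of parse_NOT, together with the final decision step of Section 4.4, is given below. It is not a reproduction of Figure 2: the function classify, the dictionary-based score table, and the rule that a tie or an all-zero score falls back to NOT are assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
def parse_not(text, non_mag_tids):
    """Cumulative NonMagTIDS score over all tokens of a preprocessed string."""
    return sum(non_mag_tids.get(word, 0.0) for word in text.split())


def classify(hof_score, not_score):
    """Normalize the two cumulative scores and return the winning class."""
    total = hof_score + not_score
    if total == 0:
        # Assumption: with no evidence at all, fall back to NOT.
        return "NOT", (0.5, 0.5)
    dist = (hof_score / total, not_score / total)
    # Assumption: ties are resolved in favour of NOT.
    label = "HOF" if dist[0] > dist[1] else "NOT"
    return label, dist


if __name__ == "__main__":
    scores = {"clock": 0.125, "banged": 0.5}  # made-up NonMagTIDS values
    not_score = parse_not("banged clock", scores)
    print(not_score)                   # 0.625
    print(classify(25.125, not_score)[0])  # HOF
```
]]></preformat>
      <p>In this sketch, a string without any HOF token receives the same cumulative score from both modules, which is why the tie-breaking rule matters.</p>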
    </sec>
    <sec id="sec-13">
      <title>5. Results</title>
    </sec>
    <sec id="sec-14">
      <title>5.1. Results using the training dataset</title>
      <p>The proposed classification model is applied to the English and Hindi training datasets
of subtask-1A of HASOC 2021, with the magnification factor varied from one to a thousand.</p>
    </sec>
    <sec id="sec-15">
      <title>5.1.1. For the English dataset</title>
      <p>The datasets HASOC_EN_2019, HASOC_EN_2020, HASOC_EN_2021, and HASOC_EN_COMBINED
are used to evaluate the proposed classification framework on the English training dataset. The
evaluation scores on the English training datasets are reported in Table 2.</p>
      <p>The column ‘Magnification Factor’ in Table 2 indicates the value for which the proposed
classification model performs best. The performance of the model for each magnification factor (in
the range of one to a thousand) on each English dataset is shown in Figure 3.</p>
    </sec>
    <sec id="sec-16">
      <title>5.1.2. For the Hindi dataset</title>
      <p>The datasets HASOC_HI_2019, HASOC_HI_2020, HASOC_HI_2021, and HASOC_HI_COMBINED are
used to evaluate the proposed classification framework on the Hindi training dataset. The evaluation
scores on the Hindi training datasets are reported in Table 3.</p>
      <p>The column ‘Magnification Factor’ in Table 3 indicates the value for which the proposed
classification model performs best. The performance of the model for each magnification factor (in
the range of one to a thousand) on each Hindi dataset is shown in Figure 4.</p>
    </sec>
    <sec id="sec-17">
      <title>5.2. Results using the test dataset</title>
      <p>Figure 3 and Figure 4 show that the performance of the proposed classification model
starts to converge around magnification factors of 100 and 150, respectively. Therefore, magnification
factors of 100 and 150 are chosen for the evaluation of the proposed classification model on the test data for
English and Hindi, respectively. The evaluation scores for the test datasets are given in Table 4.</p>
      <p>Table 4: Evaluation on the test datasets. subtask-1A (English): magnification factor 100, Macro F1 0.6813; subtask-1A (Hindi): magnification factor 150, Macro F1 0.6762.</p>
    </sec>
    <sec>
      <title>6. Analysis</title>
      <p>A few instances of misclassified Twitter posts for both the English and Hindi test datasets are
mentioned in Table 5.</p>
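      <table-wrap>
        <label>Table 5</label>
        <caption><p>Misclassified Twitter posts from the English and Hindi test datasets of subtask-1A.</p></caption>
        <table>
          <thead>
            <tr><th>Instance</th><th>Twitter post</th><th>Preprocessed text</th><th>Actual</th><th>Predicted</th></tr>
          </thead>
          <tbody>
            <tr><td colspan="5">from subtask-1A (English)</td></tr>
            <tr><td>1</td><td>the world suffers a lot. not because of the violent of the bad people but because of the silence of the good people." // relevant always #bengalburning #bjp</td><td>world suffers lot violent bad people silence good people relevant always</td><td>NOT</td><td>HOF</td></tr>
            <tr><td>2</td><td>he fails india, he fails the world, hefails humanity. #vinashakvista #resignmodi https://t.co/3jluapqhuy</td><td>fails india fails world fails humanity</td><td>HOF</td><td>NOT</td></tr>
            <tr><td>3</td><td>you have failed as #primeminister @narendramodi #modimadedisaster we want proper #democracy you are not that leader you were in 2013. #resignpmmodi https://t.co/nghswp9ea5</td><td>failed want proper leader 2013</td><td>HOF</td><td>NOT</td></tr>
            <tr><td colspan="5">from subtask-1A (Hindi)</td></tr>
            <tr><td>4</td><td>गधा ए तू है इसकिलए ही ब रहा है।</td><td/><td>HOF</td><td>NOT</td></tr>
            <tr><td>5</td><td>फट्टू हैं bjp वाले #cruelmamata #bengalviolence #bengalburning https://t.co/13vmf806ht</td><td>फट्टू हैं bjp वाले</td><td>HOF</td><td>NOT</td></tr>
            <tr><td>6</td><td>हमारी वाहवाही संपूर्ण संसार में है। पर बेशर्मों से शर्म की दुहाई क्यों? #prayforfarmersvictory #farmersprotest #resignmodi https://t.co/iwebqufwdw</td><td>हमारी वाहवाही संपूर्ण संसार में है। पर बेशर्मों से शर्म की दुहाई क्यों</td><td>NOT</td><td>HOF</td></tr>
          </tbody>
        </table>
      </table-wrap>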
      <p>It is noticeable that even though instances 1 and 6 were NOT statements, they were predicted as HOF.
The reason behind this misclassification is that words like ‘suffer’ and ‘violent’ in instance 1, and the
word ‘बेशर्मों’ in instance 6, are part of the HOF_knowledge_base of English and Hindi,
respectively. Although words like ‘suffer’, ‘violent’, and ‘बेशर्मों’ are not HOF by nature, they are highly
associated with HOF posts in the training datasets. As a result, while parsing with module parse_HOF,
the cumulative score rises sharply, which in turn results in misclassification. Instances 2-5, in contrast, belong
to HOF but were classified as NOT, as they do not contain any foul, offensive, or vulgar words.</p>
    </sec>
    <sec id="sec-19">
      <title>7. Conclusions and Future Work</title>
      <p>The instances in Table 5 show that misclassification occurs when non-HOF words
that are highly associated with HOF contexts are used in NOT statements, or when HOF statements
contain only non-HOF words. Including context information while classifying a statement can therefore improve the
performance of the model. The proposed classification model is nevertheless able to identify HOF
statements when hateful or offensive phrases are present in the statement. In the future, the use of different
transformer-based models along with external datasets will be considered.</p>
    </sec>
    <sec id="sec-20">
      <title>8. Acknowledgment</title>
      <p>
        We are thankful
        <xref ref-type="bibr" rid="ref7 ref8">to the organizers of HASOC 2021</xref>
        for providing the opportunity. We also acknowledge
all the co-authors for their efforts and contributions to this research work.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Saha</surname>
          </string-name>
          , E. Chandrasekharan, M. De Choudhury.
          <article-title>Prevalence and Psychological Effects of Hateful Speech in Online College Communities</article-title>
          .
          <source>Proc ACM Web Sci Conf</source>
          .
          <year>2019</year>
          ;
          <volume>2019</volume>
          :
          <fpage>255</fpage>
          -
          <lpage>264</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3292522.3326032</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dutt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          .
          <article-title>Spread of Hate Speech in Online Social Media</article-title>
          . arXiv preprint arXiv:1812.01693v1.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>S.O.</given-names>
            <surname>Sood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. F.</given-names>
            <surname>Churchill</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Antin</surname>
          </string-name>
          . 2012b.
          <article-title>Automatic Identification of Personal Insults on Social News Sites</article-title>
          .
          <source>J. Am. Soc. Inf. Sci. Technol</source>
          .,
          <volume>63</volume>
          (
          <issue>2</issue>
          ):
          <fpage>270</fpage>
          -
          <lpage>285</lpage>
          , February.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Mondal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Silva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Benevenuto</surname>
          </string-name>
          .
          <article-title>A Measurement Study of Hate Speech in Social Media</article-title>
          .
          <source>In HT</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>C.</given-names>
            <surname>Nobata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          , A. Thomas,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mehdad</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          .
          <article-title>Abusive Language Detection in Online User Content</article-title>
          .
          <source>In WWW</source>
          , pages
          <fpage>145</fpage>
          -
          <lpage>153</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>N.</given-names>
            <surname>Djuric</surname>
          </string-name>
          ,
          <article-title>Hate Speech Detection with Comment Embeddings</article-title>
          .
          <source>In: Proceedings of the 24th international conference on world wide web</source>
          , New York;
          <year>2015</year>
          . p.
          <fpage>29</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6-2">
        <mixed-citation>
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Saleem</surname>
          </string-name>
          .
          <article-title>A Web of Hate: Tackling Hateful Speech in Online Social Spaces</article-title>
          . arXiv:1709.10159 [cs].
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6-3">
        <mixed-citation>
          <string-name>
            <given-names>J.H.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fung</surname>
          </string-name>
          .
          <article-title>One-step and Two-step Classification for Abusive Language Detection on Twitter</article-title>
          .
          <source>arXiv preprint arXiv:1706.01206</source>
          .
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6-4">
        <mixed-citation>
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Robinson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Tepper</surname>
          </string-name>
          .
          <article-title>Detecting Hate Speech on Twitter using a Convolution - GRU based Deep Neural Network</article-title>
          .
          <source>In Proceedings of ESWC</source>
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6-5">
        <mixed-citation>
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          .
          <article-title>Deep Learning for Hate Speech Detection in Tweets</article-title>
          .
          <source>In Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee</source>
          <year>2017</year>
          ,
          <fpage>759</fpage>
          -
          <lpage>760</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages</article-title>
          , in: Working Notes of FIRE 2021:
          <source>Forum for Information Retrieval Evaluation</source>
          , CEUR,
          <year>2021</year>
          . URL: http://ceur-ws.org/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages</article-title>
          ,
          <source>CoRR abs/2108.05927</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2108.05927, arXiv:2108.05927.
        </mixed-citation>
      </ref>
      <ref id="ref8-2">
        <mixed-citation>
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech</article-title>
          , in: FIRE 2021:
          <source>Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December 2021</source>
          , ACM,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>