<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting Hate Speech Spreaders on Twitter using LSTM and BERT in English and Spanish</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Moshe Uzan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yaakov HaCohen-Kerner</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, Bar Ilan University</institution>
          ,
          <addr-line>Ramat-Gan 5290002</addr-line>
          ,
          <country country="IL">Israel</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Science Department, Jerusalem College of Technology (Lev Academic Center)</institution>
          ,
          <addr-line>Jerusalem 9116001</addr-line>
          ,
          <country country="IL">Israel</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>In this paper, we describe our submissions to the PAN at CLEF 2021 contest. We tackled the subtask “Profiling Hate Speech Spreaders on Twitter”. We developed different models for English and Spanish, ranging from classic machine learning methods such as Support Vector Classifier, Multi-Layer Perceptron, Logistic Regression, Random Forest, Ada-Boost Classifier and K-Neighbors Classifier to more recent deep learning methods such as BERT and Bidirectional LSTM.</p>
      </abstract>
      <kwd-group>
        <kwd>Author Profiling</kwd>
        <kwd>Hate Speech</kwd>
        <kwd>Twitter</kwd>
        <kwd>Spanish</kwd>
        <kwd>English</kwd>
        <kwd>BERT</kwd>
        <kwd>LSTM</kwd>
        <kwd>Logistic Regression</kwd>
        <kwd>SVM</kwd>
        <kwd>MLP</kwd>
        <kwd>Random Forest</kwd>
        <kwd>Ada-Boost Classifier</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        Early works [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] referred to hate speech as abusive and hostile messages or flames. Recent
authors [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ] preferred the term cyberbullying. Other terms related to
hate speech are also used in the NLP community, such as discrimination, flaming, abusive
language, profanity, and toxic language or comment [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In defining this phenomenon, however, the term
hate speech tends to be used the most [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        Identifying whether a text contains hateful language is not an easy task, even for humans.
Although there is no single formal definition of hate speech, a common definition is given
by [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] as any communication that disparages a person or a group on the basis of some
characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or
other characteristic [
        <xref ref-type="bibr" rid="ref10 ref12 ref13 ref14 ref15 ref9">9, 10, 12, 13, 14, 15</xref>
        ]. Some examples are given by Biere et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and de
Gibert et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]:
1. God bless them all, to hell with the black.
2. Wipe out the Jews.
3. Women are like grass, they need to be beaten/cut regularly.
      </p>
      <p>
        Fortuna and Nunes [16] noted in their survey paper that the most widely used approach
to hate speech detection is supervised learning, with a focus on support vector machines (SVM) [17, 18,
19], followed by Random Forests [20] and Decision Trees [21]. Schmidt and Wiegand [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
found that recurrent neural networks (RNN) are also very common [22].
      </p>
      <p>Badjatiya et al. [23] proposed a deep learning approach and obtained very good results using
word embeddings. Zampieri et al. [24] showed that n-grams can perform well for hate speech
detection using SVMs with different surface-level features, such as surface n-grams, word
skip-grams, and word representation n-grams induced with Brown clustering. They also noticed
that these features reached their limits on more complex tasks, e.g., distinguishing profanity
from hate speech. In such tasks, more in-depth linguistic characteristics may be required. More
recently, attention mechanisms [25] and Transformers [26], and especially language representation
models such as BERT [27], have become available in NLP for such tasks.</p>
      <p>
        Schmidt and Wiegand [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] noted that, in addition to the absence of conventional terminology
mentioned above, the lack of common datasets on which to conduct research is a challenging
obstacle to progress in this area. Indeed, judgements about the general effectiveness
or non-effectiveness of research conducted on various datasets can be inconsistent. For better
consistency and comparability of different features and methods, they argue for a
benchmark dataset for hate speech detection. This is the approach taken by competitions
such as PAN at CLEF 2021 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which provide the same dataset to all participants and publish
the method and results of each participant on this benchmark
dataset.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Experimental Results and Submitted Models</title>
      <sec id="sec-3-1">
        <title>3.1. Task dataset</title>
        <p>PAN at CLEF 2021, with the subtask “Profiling Hate Speech Spreaders on Twitter”, proposed an
original task: building a model that classifies a user as a hate speech spreader or not, instead of predicting
whether a single post is hateful. For each user we were given 200 tweets, and we need to classify the user as a hate
speech spreader or not. The complexity of the task follows from the fact that the tweets of only 200 users
were given as the training set, meaning that we have 200 clusters of 200 tweets and one label for each cluster.
The task must be performed in two languages, English and Spanish, which increases the difficulty, since
models that give good results in one language may give worse results in the other.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Basic models</title>
        <p>First, we split the tweets written by the 200 users into a training set and a validation set holding 20 percent of the
given data, which gives us 160 labeled users for training and 40 for validation (with 200
tweets for each user). As in our previous work [28], we began with basic models such as Support
Vector Classifier (SVC), Multi-Layer Perceptron (MLP) and Logistic Regression, but also more
sophisticated ones such as Random Forest (RF), Ada-Boost Classifier (ABF) and K-Neighbors
Classifier (KN), using classical features such as character n-grams and word n-grams. Some
models gave us very good accuracy, but given that the dataset is relatively small, this was not
representative. We therefore repeated these experiments using 10-fold cross-validation. The results
were lower, but they seem more representative.</p>
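        <p>The setup above can be illustrated with a minimal sketch, which is an illustration rather than the submitted code. It counts character n-grams over all tweets of one user and builds a 10-fold split at the user level, so that the 200 tweets of a user never appear on both sides of a split. The function names, the n-gram range (2 to 3 in the toy call) and the toy tweets are assumptions made for illustration; the classifier itself is omitted.</p>
        <preformat>
```python
import random
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Count character n-grams of length n_min..n_max in one string."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def user_features(tweets, n_min=2, n_max=4):
    """Sum the n-gram counts of all tweets written by one user."""
    total = Counter()
    for tweet in tweets:
        total.update(char_ngrams(tweet.lower(), n_min, n_max))
    return total


def k_fold_user_splits(n_users, k=10, seed=0):
    """Yield (train_ids, val_ids) pairs; each user is validated exactly once."""
    ids = list(range(n_users))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for held_out in range(k):
        val_ids = folds[held_out]
        train_ids = [u for j, fold in enumerate(folds) if j != held_out for u in fold]
        yield train_ids, val_ids


# Toy stand-in for one user's timeline (the real task gives 200 tweets per user).
toy_user = ["good morning", "good night"]
features = user_features(toy_user, n_min=2, n_max=3)
splits = list(k_fold_user_splits(n_users=200, k=10))
print(features["goo"], len(splits), len(splits[0][0]), len(splits[0][1]))
# prints: 2 10 180 20
```
        </preformat>
        <p>Splitting by user rather than by tweet keeps each label attached to a whole cluster of tweets, which is the unit the task evaluates.</p>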
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Deep learning models</title>
        <p>We realized that basic models can lead to a significantly lower accuracy on the test set
than in cross-validation, so we went further and experimented with deeper
approaches using BERT as the language representation of tweets. We used pretrained BERT models:
for English we used [27] and for Spanish we used [29]. For each tweet we obtain the corresponding
BERT representation. From there, we tried different methods.</p>
        <sec id="sec-3-3-1">
          <title>Table 1: Accuracy results of our first models.</title>
          <p>In the first model, the 200 tweet representations are fed into two successive ReLU-activated dense layers, the first with 256
output features and the second with 64 output features. We then obtain a single
64-dimensional vector by taking the mean over the 200 vectors. Finally, a ReLU-activated dense layer
classifies this vector as hateful or not.</p>
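          <p>The averaging model described above can be sketched as a forward pass in NumPy. This is an illustration under stated assumptions rather than the submitted implementation: the weights are random and untrained, the input dimension of 768 assumes a BERT-base encoder, random vectors stand in for real tweet representations, and a single output unit is assumed for the hateful or not decision.</p>
          <preformat>
```python
import numpy as np

rng = np.random.default_rng(0)


def dense(in_dim, out_dim):
    """Randomly initialised (weights, bias) pair; training is not shown."""
    return rng.normal(0.0, 0.02, size=(in_dim, out_dim)), np.zeros(out_dim)


def relu(x):
    return np.maximum(x, 0.0)


def averaging_forward(tweet_vectors, layers):
    """Map one user's tweet vectors, shape (200, 768), to one output score."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h = relu(tweet_vectors @ w1 + b1)   # (200, 256): applied to every tweet
    h = relu(h @ w2 + b2)               # (200, 64)
    user_vector = h.mean(axis=0)        # (64,): mean over the 200 tweets
    return relu(user_vector @ w3 + b3)  # (1,): ReLU-activated output layer


# Assumption: 768-dimensional inputs; random vectors replace real embeddings.
layers = [dense(768, 256), dense(256, 64), dense(64, 1)]
fake_user = rng.normal(size=(200, 768))
score = averaging_forward(fake_user, layers)
print(score.shape)
```
          </preformat>
          <p>Because the 200 vectors are combined with a mean, the output does not depend on the order of the tweets.</p>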
          <p>The second model we developed takes the 200 BERT representation vectors and feeds them into
a Bi-LSTM with 2 x 32 features in the hidden layer. Finally, a ReLU-activated dense layer
classifies the 64-feature output as hateful or not. We use the AdamW [30] variant of the Adam [31]
algorithm as the optimizer for each model<sup>2</sup>. After our first submission we noticed that there was
a rather large gap in English between the result obtained on our development set and the final
result on the test set. We therefore decided to increase the dropout rates and use a BERT model that had
been trained on tweets [32].</p>
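          <p>The Bi-LSTM model described in this subsection can be sketched in the same way. The code below is a from-scratch NumPy forward pass written only to make the shapes explicit, not the submitted implementation. It assumes that the final hidden state of each direction (32 features each) is concatenated into the 64-feature output, that the inputs are 768-dimensional, and that a single output unit is used; the weights are random and all names are hypothetical.</p>
          <preformat>
```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(in_dim, hidden):
    """Random parameters for the four gates (input, forget, cell, output)."""
    return {
        "w_x": rng.normal(0.0, 0.02, size=(in_dim, 4 * hidden)),
        "w_h": rng.normal(0.0, 0.02, size=(hidden, 4 * hidden)),
        "b": np.zeros(4 * hidden),
    }


def lstm_last_state(sequence, params):
    """Run one LSTM direction over (steps, in_dim); return the final hidden state."""
    hidden = params["w_h"].shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in sequence:
        gates = x_t @ params["w_x"] + h @ params["w_h"] + params["b"]
        i, f, g, o = np.split(gates, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def bilstm_features(sequence, fwd, bwd):
    """Concatenate the final states of both directions: 2 x 32 = 64 features."""
    return np.concatenate([lstm_last_state(sequence, fwd),
                           lstm_last_state(sequence[::-1], bwd)])


fwd, bwd = init_lstm(768, 32), init_lstm(768, 32)
w_out, b_out = rng.normal(0.0, 0.02, size=(64, 1)), np.zeros(1)
fake_user = rng.normal(size=(200, 768))            # stand-in for 200 BERT vectors
features = bilstm_features(fake_user, fwd, bwd)
score = np.maximum(features @ w_out + b_out, 0.0)  # ReLU-activated output layer
print(features.shape, score.shape)
```
          </preformat>
          <p>Unlike the averaging model, this model reads the tweets as a sequence, so its output can depend on their order.</p>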
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Experimental Results</title>
        <p>First, we submitted two models: the averaging model for English and the LSTM-based model for Spanish.
On our held-out validation split, we obtained an accuracy of 0.70 for English and 0.81 for Spanish. On the test set, we obtained
an accuracy of 0.62 in English and 0.70 in Spanish, giving an overall accuracy of 0.66. We then
made a second submission: for English we kept the same model, but for Spanish we
switched to the averaging model with different training parameters. Surprisingly, the final results<sup>3</sup>
showed that, contrary to our observations, traditional methods give very good results (see
Table 2). The best result was obtained by SiinoDiNuovo, with an accuracy of 0.73 in English
and 0.85 in Spanish. With our result, we tied for 43rd place.</p>
        <p><sup>2</sup> For more precise details about the dropout rates and batch sizes used, we publish the code on GitHub: https://github.com/machouz/pan_transformers</p>
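        <p>The overall accuracy of 0.66 quoted above is consistent with an unweighted mean of the two per-language accuracies. A short check, in which the function name and the dictionary keys are hypothetical:</p>
        <preformat>
```python
def overall_accuracy(per_language):
    """Unweighted mean of the per-language accuracies."""
    return sum(per_language.values()) / len(per_language)


first_submission = {"en": 0.62, "es": 0.70}
print(round(overall_accuracy(first_submission), 2))  # 0.66
```
        </preformat>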
        <sec id="sec-3-4-1">
          <title>Table 2</title>
          <p>Models listed (column “Model”): SiinoDiNuovo; char nGrams+Logistic; AveragingBERT; MBERT-LSTM; Bi-LSTM-BERT; TFIDF-LSTM.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions and Future Work</title>
      <p>In this paper, we described the models submitted to the Profiling Hate Speech Spreaders on
Twitter task at PAN 2021. Originally, we looked at a number of machine learning models using
basic features. However, we finally turned to deep learning models. These
models generally do well in the tasks to which they are applied, and this is what we observed
in our experiments. Our final models use BERT as the language representation and
averaging or an LSTM to perform the classification. The difficulty here was dealing with the limited
amount of data available. Our overall accuracy in our first submission was 0.69. Classifying tweets
remains a difficult task given Twitter-style informal writing.</p>
      <p>
        Many tweets contain acronyms that can appear in different forms. These acronyms
can lead to ambiguity. Future research may look for ways to lessen this ambiguity.
Acronym disambiguation [33] would extend and enrich the tweet’s text and might enable better
classification. We also suggest examining the usefulness of skip character n-grams, because they
serve as generalized n-grams that allow us to overcome problems such as noise and sparse data
[34]. Other ideas that may lead to better classification are to use stylistic feature sets [
        <xref ref-type="bibr" rid="ref16">35</xref>
        ], key
phrases [
        <xref ref-type="bibr" rid="ref17">36</xref>
        ], and summaries [
        <xref ref-type="bibr" rid="ref18">37</xref>
        ].
      </p>
      <p>
        The final results show that more traditional methods may turn out to be more relevant. These methods
can be combined with k-fold cross-validation (see [
        <xref ref-type="bibr" rid="ref19">38</xref>
        ]), especially when, as in this contest,
the available data is limited [28].
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We thank the Bar-Ilan Data Science Institute for kindly providing a server for training our models.
Without their support, this research would not have been possible. We are also grateful to the
organizers and reviewers who gave us the opportunity to do this research.</p>
      <p><sup>3</sup> The whole table of results is available at https://pan.webis.de/clef21/pan21-web/author-profiling.html#results</p>
      <p>[15] Z. Zhang, L. Luo, Hate speech detection: A solved problem? The challenging case of long tail on twitter, Semantic Web 10 (2019) 925–945.</p>
      <p>[16] P. Fortuna, S. Nunes, A survey on automatic detection of hate speech in text, ACM Computing Surveys (CSUR) 51 (2018) 1–30.</p>
      <p>[17] S. Malmasi, M. Zampieri, Detecting hate speech in social media, arXiv preprint arXiv:1712.06427 (2017).</p>
      <p>[18] T. Davidson, D. Warmsley, M. Macy, I. Weber, Automated hate speech detection and the problem of offensive language, in: Eleventh International AAAI Conference on Web and Social Media, 2017.</p>
      <p>[19] D. Robinson, Z. Zhang, J. Tepper, Hate speech detection on twitter: feature engineering vs feature selection, in: European Semantic Web Conference, Springer, 2018, pp. 46–49.</p>
      <p>[20] P. Burnap, M. L. Williams, Us and them: identifying cyber hate on twitter across multiple protected characteristics, EPJ Data Science 5 (2016) 11.</p>
      <p>[21] P. Burnap, M. L. Williams, Hate speech, machine classification and statistical modelling of information flows on twitter: Interpretation and communication for policy decision making (2014).</p>
      <p>[22] J. Pavlopoulos, P. Malakasiotis, I. Androutsopoulos, Deep learning for user comment moderation, arXiv preprint arXiv:1705.09993 (2017).</p>
      <p>[23] P. Badjatiya, S. Gupta, M. Gupta, V. Varma, Deep learning for hate speech detection in tweets, in: Proceedings of the 26th International Conference on World Wide Web Companion, 2017, pp. 759–760.</p>
      <p>[24] M. Zampieri, S. Malmasi, P. Nakov, S. Rosenthal, N. Farra, R. Kumar, Predicting the type and target of offensive posts in social media, arXiv preprint arXiv:1902.09666 (2019).</p>
      <p>[25] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, 2014. arXiv:1409.0473.</p>
      <p>[26] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, 2017. arXiv:1706.03762.</p>
      <p>[27] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, 2018. arXiv:1810.04805.</p>
      <p>[28] M. Uzan, Y. HaCohen-Kerner, JCT at SemEval-2020 Task 12: Offensive language detection in tweets using preprocessing methods, character and word n-grams, in: Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020, pp. 2017–2022.</p>
      <p>[29] J. Cañete, G. Chaperon, R. Fuentes, J.-H. Ho, H. Kang, J. Pérez, Spanish pre-trained BERT model and evaluation data, in: PML4DC at ICLR 2020, 2020.</p>
      <p>[30] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, arXiv preprint arXiv:1711.05101 (2017).</p>
      <p>[31] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).</p>
      <p>[32] D. Q. Nguyen, T. Vu, A. T. Nguyen, BERTweet: A pre-trained language model for English Tweets, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020.</p>
      <p>[33] Y. HaCohen-Kerner, H. Beck, E. Yehudai, D. Mughaz, Stylistic feature sets as classifiers of documents according to their historical period and ethnic origin, Applied Artificial Intelligence 24 (2010) 847–862.</p>
      <p>[34] Y. HaCohen-Kerner, Z. Ido, R. Ya’akobov, Stance classification of tweets using skip char ngrams, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2017, pp. 266–278.</p>
    </sec>
    <sec id="sec-6">
      <title>A. Online Resources</title>
      <sec id="sec-6-1">
        <title>The sources for this work are available via</title>
        <p>• GitHub,</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L. D. L. P.</given-names>
            <surname>Sarracén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          , I. Markov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolska</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2021:
          <article-title>Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection</article-title>
          ,
          <source>in: 12th International Conference of the CLEF Association (CLEF</source>
          <year>2021</year>
          ), Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L. D. L. P.</given-names>
            <surname>Sarracén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          , E. Fersini,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <source>Profiling Hate Speech Spreaders on Twitter Task at PAN</source>
          <year>2021</year>
          ,
          <article-title>in: CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS</article-title>
          .org,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gollub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , TIRA Integrated Research Architecture, in: N.
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          Peters (Eds.),
          <source>Information Retrieval Evaluation in a Changing World, The Information Retrieval Series</source>
          , Springer, Berlin Heidelberg New York,
          <year>2019</year>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-030-22948-1_5</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Spertus</surname>
          </string-name>
          ,
          <article-title>Smokey: Automatic recognition of hostile messages</article-title>
          , in: AAAI/IAAI,
          <year>1997</year>
          , pp.
          <fpage>1058</fpage>
          -
          <lpage>1065</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kaufer</surname>
          </string-name>
          ,
          <article-title>Flaming: A white paper</article-title>
          , Department of English, Carnegie Mellon University,
          <year>2000</year>
          . Retrieved July 20, 2012.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-S.</given-names>
            <surname>Jun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellmore</surname>
          </string-name>
          ,
          <article-title>Learning from bullying traces in social media</article-title>
          ,
          <source>in: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>656</fpage>
          -
          <lpage>666</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hosseinmardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Mattson</surname>
          </string-name>
          ,
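          <string-name>
            <given-names>R. I.</given-names>
            <surname>Rafiq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Han</surname>
          </string-name>
          ,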
          <string-name>
            <given-names>Q.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <article-title>Detection of cyberbullying incidents on the instagram social network</article-title>
          ,
          <source>arXiv preprint arXiv:1503.03909</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Squicciarini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Rajtmajer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Griffin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Caragea</surname>
          </string-name>
          ,
          <article-title>Content-driven detection of cyberbullying on the instagram social network</article-title>
          .,
          <source>in: IJCAI</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>3952</fpage>
          -
          <lpage>3958</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <article-title>A survey on hate speech detection using natural language processing</article-title>
          ,
          <source>in: Proceedings of the Fifth International workshop on natural language processing for social media</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Biere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhulai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Analytics</surname>
          </string-name>
          ,
          <article-title>Hate speech detection using natural language processing techniques</article-title>
          ,
          <source>Master Business Analytics, Department of Mathematics, Faculty of Science</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Nockleby</surname>
          </string-name>
          , Hate speech,
          <source>Encyclopedia of the American constitution 3</source>
          (
          <year>2000</year>
          )
          <fpage>1277</fpage>
          -
          <lpage>1279</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Robinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tepper</surname>
          </string-name>
          ,
          <article-title>Detecting hate speech on twitter using a convolution-gru based deep neural network</article-title>
          ,
          <source>in: European semantic web conference</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>745</fpage>
          -
          <lpage>760</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>O.</given-names>
            <surname>de Gibert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García-Pablos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cuadros</surname>
          </string-name>
          ,
          <article-title>Hate speech dataset from a white supremacy forum</article-title>
          ,
          <source>arXiv preprint arXiv:1809.04444</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bosco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Fersini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Debora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. R.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          , et al.,
          <article-title>Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter</article-title>
          ,
          <source>in: 13th International Workshop on Semantic Evaluation, Association for Computational Linguistics</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>54</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , L. Luo,
          <article-title>Hate speech detection: A solved problem? The challenging case of long tail on twitter</article-title>
          ,
          <source>Semantic Web</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>925</fpage>
          -
          <lpage>945</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Y.</given-names>
            <surname>HaCohen-Kerner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Peretz</surname>
          </string-name>
          ,
          <article-title>Haads: A hebrew aramaic abbreviation disambiguation system</article-title>
          ,
          <source>Journal of the American Society for Information Science and Technology</source>
          <volume>61</volume>
          (
          <year>2010</year>
          )
          <fpage>1923</fpage>
          -
          <lpage>1932</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Y.</given-names>
            <surname>HaCohen-Kerner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Stern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korkus</surname>
          </string-name>
          ,
          <string-name>
            <surname>E. Fredj,</surname>
          </string-name>
          <article-title>Automatic machine learning of keyphrase extraction from short html documents written in hebrew</article-title>
          ,
          <source>Cybernetics and Systems: An International Journal</source>
          <volume>38</volume>
          (
          <year>2007</year>
          )
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Y.</given-names>
            <surname>HaCohen-Kerner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Malin</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Chasson</surname>
          </string-name>
          ,
          <article-title>Summarization of jewish law articles in hebrew</article-title>
          .,
          <source>in: CAINE</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>172</fpage>
          -
          <lpage>177</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Grandvalet</surname>
          </string-name>
          ,
          <article-title>No unbiased estimator of the variance of k-fold cross-validation</article-title>
          ,
          <source>Journal of machine learning research 5</source>
          (
          <year>2004</year>
          )
          <fpage>1089</fpage>
          -
          <lpage>1105</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>