<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Hate and Offensive Speech Detection in Hindi and Marathi</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Raviraj Joshi</string-name>
          <email>ravirajoshi@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhishek Velankar</string-name>
          <email>velankarabhishek@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hrushikesh Patil</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amol Gore</string-name>
          <email>amolgore2512@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shubham Salunke</string-name>
          <email>shubhamsalunke30012001@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Madras</institution>
          ,
          <addr-line>Chennai, Tamil Nadu</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>L3Cube</institution>
          ,
          <addr-line>Pune, Maharashtra</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Pune Institute of Computer Technology</institution>
          ,
          <addr-line>Pune, Maharashtra</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sentiment analysis is a basic NLP task that determines the polarity of text data. There has been a significant amount of work in the area of multilingual text as well. Still, hate and offensive speech detection faces a challenge due to the inadequate availability of data, especially for Indian languages like Hindi and Marathi. In this work, we consider hate and offensive speech detection in Hindi and Marathi texts. The problem is formulated as a text classification task using state-of-the-art deep learning approaches. We explore different deep learning architectures like CNN, LSTM, and variations of BERT like multilingual BERT, IndicBERT, and monolingual RoBERTa. The basic models based on CNN and LSTM are augmented with FastText word embeddings. We use the HASOC 2021 Hindi and Marathi hate speech datasets to compare these algorithms. The Marathi dataset consists of binary labels and the Hindi dataset consists of binary as well as more fine-grained labels. We show that the transformer-based models perform the best, and even the basic models along with FastText embeddings give a competitive performance. Moreover, with simple hyper-parameter tuning, the basic models perform better than BERT-based models on the fine-grained Hindi dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Hate speech is often defined as the use of hateful language intended to attack, provoke,
intimidate, express contempt for, or harm a person or a group on the basis of
their race, religion, ethnic origin, disability or gender [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. The advancement of technology
has led to an increase in the use of social media and its accessibility across the globe. Several
online social media users post harmful content without realizing that their actions often offend a
person or a group of people [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. It is therefore important to automatically detect and filter out
such harmful content from the massive textual content being posted online every day [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5, 6, 7</xref>
        ].
      </p>
      <p>
        Hindi is one of the official languages of India and is spoken by around 45% of its population
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Due to its popularity in India, there are a large number of social media activities performed
in the Hindi language written in Devanagari script. It is therefore important to detect hate
speech in the Hindi language.
      </p>
      <p>
        Marathi is the native language of Maharashtra state in India. It is spoken by around 83 million
people all over the country and it ranks as the third most spoken language in India. People find
it easier to express their opinions in regional languages and hence social media activities in
Marathi have been quite popular among the Marathi-speaking diaspora. Most of the work in
the area of sentiment analysis and hate speech detection has been concentrated on English [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Exposure to native low-resource languages has been increasing in recent times. We specifically
focus on low-resource Indian languages Hindi and Marathi.
      </p>
      <p>In this work, we treat hate speech detection as a text classification problem and explore
various deep learning approaches for the task. The datasets used are provided in the HASOC
2021 shared task [10]. These datasets consist of text from different Twitter posts, tagged
manually as hate and non-hate. The Marathi dataset has binary labels whereas the Hindi dataset
consists of binary labels as well as more fine-grained labels, namely none, hate, offensive, and
profane. We analyze CNN and LSTM based models for the binary classification task. The word
embeddings are initialized using the corresponding Hindi or Marathi FastText word vectors. We
also evaluate transformer-based models, particularly variations of BERT such as IndicBERT,
mBERT, and RoBERTa for Hindi and Marathi [11, 12]. A hierarchical approach is used for the
fine-grained 4-class classification task in Hindi, where we first distinguish the text between the
hate and non-hate classes and then further classify the hate-class text into three
labels: HATE, OFFN, and PRFN. The hierarchical approach is compared with its direct
multi-class counterpart. The best BERT models for each of the tasks are shared publicly
(https://huggingface.co/l3cube-pune/hate-bert-hasoc-marathi,
https://huggingface.co/l3cube-pune/hate-roberta-hasoc-hindi,
https://huggingface.co/l3cube-pune/hate-multi-roberta-hasoc-hindi).</p>
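<p>The FastText initialization mentioned above can be sketched as follows. This is a minimal illustration, not the actual training code: `toy_vectors` is a tiny stand-in for the pre-trained Hindi or Marathi FastText vectors (which are 300-dimensional), and words missing from the vector file are left as zero rows.</p>

```python
import numpy as np

# Stand-in for pre-trained FastText vectors; real vectors are 300-dimensional,
# here 4 dimensions keep the sketch small.
toy_vectors = {
    "good": np.array([0.1, 0.2, 0.3, 0.4]),
    "bad":  np.array([0.4, 0.3, 0.2, 0.1]),
}
EMB_DIM = 4

def build_embedding_matrix(vocab, vectors, dim):
    """Rows follow the vocabulary index; out-of-vocabulary words stay zero."""
    matrix = np.zeros((len(vocab), dim))
    for word, idx in vocab.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix

vocab = {"good": 0, "bad": 1, "unseen": 2}
emb = build_embedding_matrix(vocab, toy_vectors, EMB_DIM)
print(emb.shape)  # (3, 4)
```

<p>This matrix is then used to initialize the embedding layer, either frozen (static) or trainable.</p>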
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>The low-resource nature of the Hindi and Marathi languages has limited the extent of work on hate
speech detection in these languages. Typical deep learning models like 1D CNN, LSTM, and
BiLSTM, along with domain-specific word embeddings, were evaluated on a Hindi-English code-mixed
dataset in [13]. They also showed that these deep learning models perform substantially
better than traditional machine learning approaches such as SVMs and random forests.</p>
      <p>A comparative study between machine learning and deep learning architectures for hate
speech detection is presented in [14], using datasets of English tweets.
Different combinations of feature engineering have been experimented with, including machine
learning models like Logistic Regression, Decision Trees, Random Forest, and Naive Bayes with
TF-IDF and BOW vectorizers. Pre-trained GloVe embeddings and custom word vectors have
been used to train LSTM and GRU models.</p>
      <p>In [15], the authors compared different machine learning and neural network approaches for hate
speech detection in Hindi, with further classification into hate, offensive, and profane.
Classical machine learning models like Linear SVM, AdaBoost (Adaptive Boosting), Random
Forests, and a Voting Classifier were used, along with LSTM-based deep learning approaches.
They observed that machine learning models work better than neural network models in
low-resource settings.</p>
      <p>
        Various Hindi text classification approaches have been studied in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] using BOW, CNN, LSTM,
BiLSTM, BERT, and LASER models. The work focuses specifically on Hindi text classification.
It is shown that CNN with Hindi FastText embeddings performs the best, and that LASER
gives results much closer to the best-performing model than BERT does.
      </p>
      <p>In [16], the authors propose approaches for hostile post detection in Hindi. Experiments were performed
on models like CNN, Multi-CNN, BiLSTM, CNN+BiLSTM, IndicBERT, and mBERT, along with FastText
embeddings provided by both IndicNLP and Facebook. This work shows that BERT-based models
work slightly better than the basic models. Among the basic models, the Multi-CNN model with
IndicNLP FastText word embeddings performs best.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Dataset Details</title>
      <p>We used the hate speech detection datasets provided in the HASOC 2021 shared task for Hindi and
Marathi. The text for both datasets was obtained from Twitter.</p>
      <p>Marathi Dataset Description [17]: The dataset consists of 1874 training samples with an average
of 13 words per sentence. The class-wise details are shown in Table 1. It contains a total of
625 testing samples.</p>
      <p>Hindi Dataset Description [18]: The Hindi training dataset includes a total of 4594 training samples,
divided into two tasks, with an average of 26 words per sentence. Task 1 contains
binary labels similar to Marathi, i.e. HOF and NOT. Task 2 contains multiclass labels with 4
classes, namely NONE, OFFN, HATE, and PRFN. Even though the labels in Task 2 may sound similar,
they differ in meaning, as described in Table 2. A total of 1532 test samples was provided
for both tasks.</p>
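<p>As a minimal sketch of how the two label schemes above can be encoded for training (the label names come from the dataset; the integer ids are our own illustrative convention):</p>

```python
# Task 1 uses binary labels; Task 2 uses the fine-grained labels.
TASK1_LABELS = {"NOT": 0, "HOF": 1}
TASK2_LABELS = {"NONE": 0, "OFFN": 1, "HATE": 2, "PRFN": 3}

def encode(labels, mapping):
    """Map string labels to integer class ids."""
    return [mapping[label] for label in labels]

y1 = encode(["HOF", "NOT", "HOF"], TASK1_LABELS)
y2 = encode(["PRFN", "NONE", "HATE"], TASK2_LABELS)
print(y1, y2)  # [1, 0, 1] [3, 0, 2]
```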
    </sec>
    <sec id="sec-5">
      <title>4. Model Architectures</title>
      <p>We use common deep learning text classification approaches for the task of hate speech
detection [19]. The models are used directly for the binary classification tasks, whereas a
hierarchical approach is used for the multi-class fine-grained classification. For each of the models, we
selected the epoch giving maximum validation accuracy. We used a learning rate of 0.001 for the
CNN and LSTM based models and 5e-5 for the BERT-based models. The general flow of the
classification process is outlined in Figure 1 and Figure 2. The models and the configurations
are outlined below.</p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption><p>Label descriptions for Task 1 and Task 2.</p></caption>
        <table>
          <thead>
            <tr><th>Task</th><th>Label</th><th>Description</th></tr>
          </thead>
          <tbody>
            <tr><td>Task 1</td><td>HOF</td><td>Hate and offensive content</td></tr>
            <tr><td>Task 1</td><td>NOT</td><td>Does not contain any hate, offensive, or profane content</td></tr>
            <tr><td>Task 2</td><td>NONE</td><td>Does not contain any hate or offensive content</td></tr>
            <tr><td>Task 2</td><td>OFFN</td><td>The post contains offensive language</td></tr>
            <tr><td>Task 2</td><td>HATE</td><td>Hate speech content</td></tr>
            <tr><td>Task 2</td><td>PRFN</td><td>Profane words are used</td></tr>
          </tbody>
        </table>
      </table-wrap>
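<p>Selecting the epoch with maximum validation accuracy, as described above, amounts to a simple argmax over the training history; the accuracy values below are made up for illustration.</p>

```python
# Validation accuracy per epoch, as produced by a typical training loop
# (values are illustrative, not from the paper's experiments).
val_acc = [0.71, 0.78, 0.83, 0.81, 0.80]

# Pick the 0-indexed epoch with the highest validation accuracy.
best_epoch = max(range(len(val_acc)), key=lambda e: val_acc[e])
print(best_epoch + 1, val_acc[best_epoch])  # epoch 3, accuracy 0.83
```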
      <p>CNN: The basic CNN model uses a 1D convolution layer with 300 filters of kernel size 3 and
ReLU activation, followed by max-pooling with pool size 2; the same pair of layers is added
again, followed by 1D global max-pooling. This is followed by a dense layer of size
50 with ReLU activation and, finally, an output layer with 2 nodes and softmax activation.
A dropout of 0.3 is used after the 1D max-pooling layer.</p>
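<p>A minimal NumPy sketch of the forward pass through this architecture follows. The weights are random and, to keep the sketch small, 8 filters stand in for the 300 described above; it illustrates the layer shapes, not the trained model (dropout, which only matters at training time, is omitted).</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0)

def conv1d(x, w):
    # x: (steps, in_ch), w: (kernel, in_ch, out_ch) -> valid 1D convolution
    k = w.shape[0]
    return np.stack([np.tensordot(x[i:i + k], w, axes=([0, 1], [0, 1]))
                     for i in range(x.shape[0] - k + 1)])

def maxpool1d(x, size=2):
    n = x.shape[0] // size
    return x[:n * size].reshape(n, size, -1).max(axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Embedded sentence: 30 tokens, 300-dimensional embeddings.
x = rng.normal(size=(30, 300))
w1 = rng.normal(size=(3, 300, 8)) * 0.01   # kernel 3, 8 filters (300 in the paper)
w2 = rng.normal(size=(3, 8, 8)) * 0.01
h = maxpool1d(relu(conv1d(x, w1)))          # first conv + max-pool
h = maxpool1d(relu(conv1d(h, w2)))          # second conv + max-pool
h = h.max(axis=0)                           # 1D global max-pooling -> (8,)
dense = relu(h @ (rng.normal(size=(8, 50)) * 0.1))          # dense(50), ReLU
probs = softmax(dense @ (rng.normal(size=(50, 2)) * 0.1))   # dense(2), softmax
print(probs.shape, round(probs.sum(), 6))  # (2,) 1.0
```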
      <p>LSTM: The LSTM model uses an LSTM layer with 32 nodes followed by 1D global max-pooling,
then a dense layer with 16 nodes and ReLU activation, followed by a dropout of 0.2
and, finally, a dense layer with 2 nodes and softmax activation.
BiLSTM: A Bi-LSTM layer with 300 nodes is followed by a 1D global max-pooling layer,
then a dense layer with 100 nodes and ReLU activation. This is followed by a
dropout of 0.2 and a final layer with 2 nodes and softmax activation.
BERT: BERT is a language model pre-trained on a large publicly available text corpus. It
is a transformer-based model that is bi-directional in nature. It is pre-trained using two
tasks: masked language modeling and next sentence prediction. We evaluated several variations of
BERT for both the Hindi and Marathi tasks [20].</p>
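<p>The LSTM variant's forward pass can be sketched in NumPy as below. The weights are random and the hidden size is reduced to 4 (32 in the model above); this illustrates the data flow through LSTM, global max-pooling, and the dense layers, not the trained model.</p>

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b, hidden):
    # x: (steps, in_dim); gate order in W, U, b: input, forget, cell, output.
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    outputs = []
    for t in range(x.shape[0]):
        z = x[t] @ W + h @ U + b            # (4 * hidden,)
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)          # cell state update
        h = o * np.tanh(c)                  # hidden state
        outputs.append(h)
    return np.stack(outputs)                # (steps, hidden)

emb_dim, hidden, steps = 8, 4, 10
x = rng.normal(size=(steps, emb_dim))
W = rng.normal(size=(emb_dim, 4 * hidden)) * 0.1
U = rng.normal(size=(hidden, 4 * hidden)) * 0.1
b = np.zeros(4 * hidden)

seq = lstm_forward(x, W, U, b, hidden)
pooled = seq.max(axis=0)                                    # 1D global max-pooling
dense = np.maximum(pooled @ (rng.normal(size=(hidden, 16)) * 0.1), 0)  # dense(16), ReLU
logits = dense @ (rng.normal(size=(16, 2)) * 0.1)
probs = np.exp(logits) / np.exp(logits).sum()               # softmax over 2 classes
print(seq.shape, probs.shape)  # (10, 4) (2,)
```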
      <p>• Multilingual BERT (https://huggingface.co/bert-base-multilingual-cased): Pre-trained on the
top 104 languages worldwide, including Hindi and Marathi.
• IndicBERT (https://huggingface.co/ai4bharat/indic-bert): Pre-trained on 12 major Indian
languages, released by AI4Bharat.
• roberta-base-mr (https://huggingface.co/flax-community/roberta-base-mr): Released by
flax-community, pre-trained on Marathi with the masked language modeling objective.
• roberta-hindi (https://huggingface.co/flax-community/roberta-hindi): RoBERTa base model
for Hindi released by flax-community.
• indic-transformers-hi-bert (https://huggingface.co/neuralspace-reverie/indic-transformers-hi-bert):
BERT model pre-trained on the OSCAR corpus, released by neuralspace-reverie.</p>
      <p>Hierarchical Approach:
• The first model is trained on Task 1, which has the binary labels HOF and NOT.
• The second model is trained on the ternary labels defined in Task 2 by removing entries
labeled NONE. The ternary labels are OFFN, HATE, and PRFN.
• The test data is passed through the first model to get the corresponding output labels,
HOF or NOT.
• The samples predicted as HOF are further passed to the second model for classification
into the HATE, OFFN, and PRFN labels; the results from both models are then combined
for the final result.</p>
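<p>The two-stage inference described above can be sketched with stand-in classifiers. The `binary_model` and `ternary_model` stubs below are toy keyword rules for illustration only; in the actual experiments they are the two fine-tuned RoBERTa models.</p>

```python
# Toy stand-ins for the two trained models (illustrative rules only).
def binary_model(text):
    # Stage 1: HOF vs NOT.
    return "HOF" if "hate" in text else "NOT"

def ternary_model(text):
    # Stage 2: OFFN / HATE / PRFN, applied only to HOF samples.
    return "HATE" if "hate" in text else "OFFN"

def hierarchical_predict(texts):
    results = []
    for t in texts:
        first = binary_model(t)
        # Samples flagged HOF go to the fine-grained model;
        # NOT maps directly to the NONE class.
        results.append(ternary_model(t) if first == "HOF" else "NONE")
    return results

print(hierarchical_predict(["i hate this group", "nice weather today"]))
# ['HATE', 'NONE']
```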
    </sec>
    <sec id="sec-6">
      <title>5. Results and Discussion</title>
      <p>In this work, the performance of different CNN and LSTM based models with and without
FastText embeddings was evaluated on the HASOC 2021 Marathi and Hindi datasets. Additionally,
transformer-based models, particularly variations of BERT, were used for comparison. First,
all three basic models, CNN, LSTM, and BiLSTM, were trained with random word embedding
initialization. The word embeddings were also initialized using the pre-trained FastText embeddings
from IndicNLP and then used in trainable or static mode. The non-trainable FastText embeddings appear
more promising than trainable FastText and random embeddings; in this case, the embeddings do
not overfit the training data. The results of the basic models are described in Table 3.
Table 4 summarizes the performance of the different BERT models. It shows that IndicBERT
outperforms the others on Marathi. For Hindi Task 1, the RoBERTa Hindi base model performs the
best. For Hindi Task 2, a hierarchical approach is used in which two RoBERTa Hindi base models
were trained, the first for binary and the second for ternary classification after removing the NONE entries.
This model performs better than direct multiclass classification but slightly worse than the FastText
+ CNN setting for Task 2. We observe that the BERT models are more susceptible to data imbalance
on the Hindi fine-grained task and require oversampling of the underrepresented classes. The
basic models are more robust to such imbalance, and for them the direct 4-way approach performs better
than hierarchical classification. The confusion matrices for the best model in each task are shown
in Figure 3.</p>
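<p>Oversampling the underrepresented classes, as noted above, can be done by duplicating minority-class samples until all classes match the majority-class count. This is a simple sketch of the idea; the experiments may have used a different balancing scheme.</p>

```python
import random

def oversample(samples):
    """samples: list of (text, label) pairs; duplicate minority-class samples
    until every class reaches the majority-class count."""
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    rng = random.Random(0)  # fixed seed for reproducibility
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

data = [("a", "HATE")] * 5 + [("b", "PRFN")] * 2
balanced = oversample(data)
print(len(balanced))  # 10
```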
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>In this work, we compared different deep learning approaches on the Hindi and Marathi datasets
from the HASOC 2021 shared task. The task included both binary and multiclass classification.
For binary classification in Marathi and Hindi Task 1, CNN and LSTM based models were used
along with random and FastText embeddings. Out of these, the LSTM + non-trainable FastText
setting worked the best for Marathi. In the case of Hindi, BiLSTM + non-trainable FastText
performed better. Additionally, we experimented with different transformer-based BERT models
like IndicBERT, mBERT, and RoBERTa-base for Marathi, and RoBERTa base and NeuralSpace BERT
for Hindi. IndicBERT outperformed the other models for Marathi, whereas RoBERTa performed the
best for Hindi. The same RoBERTa model was used for the hierarchical approach. We show
that transformer-based models perform better for the binary tasks, but even the basic models perform
competitively. For Hindi Task 2, it is shown that the CNN + non-trainable FastText model performs
slightly better than the RoBERTa Hindi model.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work was done under the L3Cube Pune mentorship program. We would like to express our
gratitude towards our mentors at L3Cube for their continuous support and encouragement.</p>
    </sec>
    <sec id="sec-9">
      <title>References</title>
      <p>[10] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri,
Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content
Identification in English and Indo-Aryan Languages and Conversational Hate Speech, in:
FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December
2021, ACM, 2021.</p>
      <p>[11] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional
transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).</p>
      <p>[12] D. Kakwani, A. Kunchukuttan, S. Golla, N. Gokul, A. Bhattacharyya, M. M. Khapra, P.
Kumar, IndicNLPSuite: Monolingual corpora, evaluation benchmarks and pre-trained multilingual
language models for Indian languages, in: Proceedings of the 2020 Conference on Empirical
Methods in Natural Language Processing: Findings, 2020, pp. 4948–4961.</p>
      <p>[13] S. Kamble, A. Joshi, Hate speech detection from code-mixed Hindi-English tweets using
deep learning models, 2018. arXiv:1811.05145.</p>
      <p>[14] T. Dhamija, Anjum, R. Katarya, Comparative analysis of machine learning and deep
learning algorithms for detection of online hate speech, 2021. arXiv:2108.01063.</p>
      <p>[15] V. Mujadia, P. Mishra, D. Sharma, IIIT-Hyderabad at HASOC 2019: Hate speech detection, in:
FIRE, 2019.</p>
      <p>[16] R. Joshi, R. Karnavat, K. Jirapure, R. Joshi, Evaluation of deep learning models for
hostility detection in Hindi text, in: 2021 6th International Conference for Convergence in
Technology (I2CT), IEEE, 2021, pp. 1–5.</p>
      <p>[17] S. Gaikwad, T. Ranasinghe, M. Zampieri, C. M. Homan, Cross-lingual offensive language
identification for low resource languages: The case of Marathi, in: Proceedings of RANLP,
2021.</p>
      <p>[18] T. Mandl, S. Modha, G. K. Shahi, H. Madhu, S. Satapara, P. Majumder, J. Schäfer, T.
Ranasinghe, M. Zampieri, D. Nandini, A. K. Jaiswal, Overview of the HASOC subtrack at FIRE
2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan
Languages, in: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation,
CEUR, 2021. URL: http://ceur-ws.org/.</p>
      <p>[19] A. Kulkarni, M. Mandhane, M. Likhitkar, G. Kshirsagar, J. Jagdale, R. Joshi,
Experimental evaluation of deep learning models for Marathi text classification, arXiv preprint
arXiv:2101.04899 (2021).</p>
      <p>[20] T. Wolf, J. Chaumond, L. Debut, V. Sanh, C. Delangue, A. Moi, P. Cistac, M. Funtowicz,
J. Davison, S. Shleifer, et al., Transformers: State-of-the-art natural language processing, in:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing:
System Demonstrations, 2020, pp. 38–45.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacAvaney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-R.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Frieder</surname>
          </string-name>
          ,
          <article-title>Hate speech detection: Challenges and solutions</article-title>
          ,
          <source>PloS one 14</source>
          (
          <year>2019</year>
          )
          <article-title>e0221152</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Matamoros-Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Farkas</surname>
          </string-name>
          ,
          <article-title>Racism, hate speech, and social media: A systematic review and critique</article-title>
          ,
          <source>Television &amp; New Media</source>
          <volume>22</volume>
          (
          <year>2021</year>
          )
          <fpage>205</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Banko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>MacKeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <article-title>A unified taxonomy of harmful content</article-title>
          ,
          <source>in: Proceedings of the fourth workshop on online abuse and harms</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Scheuerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fiesler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Brubaker</surname>
          </string-name>
          ,
          <article-title>Understanding international perceptions of the severity of harmful content online</article-title>
          ,
          <source>PloS one 16</source>
          (
          <year>2021</year>
          )
          <article-title>e0256762</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <article-title>A survey on hate speech detection using natural language processing</article-title>
          ,
          <source>in: Proceedings of the fith international workshop on natural language processing for social media</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Del Vigna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cimino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dell'Orletta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Petrocchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tesconi</surname>
          </string-name>
          ,
          <article-title>Hate me, hate me not: Hate speech detection on facebook</article-title>
          ,
          <source>in: Proceedings of the First Italian Conference on Cybersecurity (ITASEC17)</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Aluru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>Deep learning models for multilingual hate speech detection</article-title>
          , arXiv preprint arXiv:2004.06465 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <article-title>Deep learning for hindi text classification: A comparison</article-title>
          ,
          <source>in: International Conference on Intelligent Human Computer Interaction</source>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>94</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mandhane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Likhitkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kshirsagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <article-title>L3cubemahasent: A marathi tweet-based sentiment analysis dataset</article-title>
          ,
          <source>in: Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>213</fpage>
          -
          <lpage>220</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>