<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Context-aware Language Modeling for Arabic Misogyny Identification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Istabrak Abbes</string-name>
          <email>istabrak.abbes@etudiant-enit.utm.tn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eya Nakache</string-name>
          <email>eya.nakache@etudiant-enit.utm.tn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Moez BenHajHmida</string-name>
          <email>moez.benhajhmida@enit.utm.tn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Natural Language Processing</institution>
          ,
          <addr-line>classification, BERT, MarBERT, misogyny, Arabic Dialects</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Tunis El Manar</institution>
          ,
          <addr-line>Campus Universitaire El Manar, Le Belvedere, 1002</addr-line>
          ,
          <country country="TN">Tunisia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <fpage>13</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>In this paper, we describe our efforts on the shared task of Arabic Misogyny Identification (ArMI) [1]. We tackled the Misogyny Content Identification subtask (Subtask-1). Our experiments were based on preprocessing the given data, then fine-tuning the pretrained MARBERT language model on the misogyny identification downstream task. Experimental results, obtained on Subtask-1 only, show that keeping emojis in the text can influence the model's predictions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>preprocessing strategy.</p>
      <p>The rest of the paper is organized as follows: in Section 2 we introduce the ArMI dataset. In
Section 3, we describe our approach to the problem. Then, in Section 4, we present
and discuss the results of the proposed method on Subtask-1. Section 5 concludes our work
on this shared task.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Data</title>
      <p>
        The released training dataset [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for the ArMI competition [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is the same for both shared subtasks.
The organizers provided a dataset containing 7866 tweets for training and a set of tweets for testing.
For the misogyny detection task (Subtask-1), each tweet is annotated with the label “misogyny”
if it is misogynistic and “none” otherwise. Table 1 presents statistics of
the training dataset for Subtask-1. For the second subtask (Subtask-2), Misogyny Behavior
Identification, each tweet is assigned one of the following labels:
      </p>
      <list list-type="bullet">
        <list-item><p>Damning (Damn): tweets under this class contain cursing content.</p></list-item>
        <list-item><p>Derailing (Der): tweets under this class contain justifications of the abuse or mistreatment of women.</p></list-item>
        <list-item><p>Discredit (Disc): tweets under this class bear slurs and offensive language against women.</p></list-item>
        <list-item><p>Dominance (Dom): tweets under this class imply the superiority of men over women.</p></list-item>
        <list-item><p>Sexual Harassment (Harass): tweets under this class describe sexual advances and abuse of a sexual nature.</p></list-item>
        <list-item><p>Stereotyping and Objectification (Obj): tweets under this class promote a fixed image of women or describe women’s physical appeal.</p></list-item>
        <list-item><p>Threat of Violence (Vio): tweets under this class have intimidating content with threats of physical violence.</p></list-item>
        <list-item><p>None: no misogynistic behavior exists.</p></list-item>
      </list>
    </sec>
    <sec id="sec-3">
      <title>3. System</title>
      <p>This section describes the various data preparation procedures and models utilized in the
experiments.</p>
      <sec id="sec-3-1">
        <title>3.1. Preprocessing</title>
        <p>To preprocess the dataset, we applied four main stages:</p>
        <list list-type="bullet">
          <list-item><p>Cleaning: we removed all diacritics (damma, tashdid, fatha, kasra, etc.), English words and numbers, English and Arabic punctuation, URLs, and USER mention tokens.</p></list-item>
          <list-item><p>Elongation removal: any character repeated more than twice in a row was collapsed. For example, the word “اكيدددد” becomes “اكيد” after preprocessing.</p></list-item>
          <list-item><p>Letter normalisation: Arabic characters that appear in a variety of forms were mapped to a single form. For example, the letters آ, إ, and أ are replaced with ا.</p></list-item>
          <list-item><p>Hashtag keyword extraction: to extract intelligible key phrases, we deleted the hash symbol “#” and replaced each underscore “_” within a hashtag with a white space. For example, “#بتشتغل_يا_أحمد” becomes “بتشتغل يا أحمد”.</p></list-item>
        </list>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model</title>
        <p>
          In a complex NLP task like misogyny identification, we need context-aware embedding tools.
BERT [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and its derivative language models provide powerful contextualized embeddings.
        </p>
        <p>
          Recently, several works have focused on the Arabic language and its dialects, notably AraBERT [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], ARBERT
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], and MARBERT [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. AraBERT and ARBERT are built on pure Modern Standard Arabic (MSA) datasets, while MARBERT
focuses on dialectal Arabic (DA). MARBERT is pre-trained on 1 billion Arabic MSA and DA tweets.
In [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], MARBERT performed better than AraBERT and ARBERT on dialectal
Arabic. We therefore chose MARBERT to build our classification
model.
        </p>
        <p>After preprocessing the ArMI dataset as described above, we split it into 90% for
training and 10% for validation. We fine-tuned MARBERT on the 90% split with a batch size
of 32 and a sequence length of 128 for 5 epochs.</p>
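        <p>The data handling implied by this setup can be sketched without any deep learning library. The following is a minimal sketch, not our training script: the helper names (train_val_split, pad_or_truncate, iter_batches), the fixed seed, the padding id 0, and the rounding used to size the validation split are assumptions, and the fine-tuning step itself is omitted.</p>
        <preformat><![CDATA[
```python
import math
import random

BATCH_SIZE = 32     # batch size reported in the text
MAX_SEQ_LEN = 128   # sequence length reported in the text
EPOCHS = 5          # number of epochs reported in the text


def train_val_split(examples, val_fraction=0.10, seed=42):
    """Shuffle once with a fixed seed, then hold out val_fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)  # assumption: round to nearest
    return shuffled[n_val:], shuffled[:n_val]


def pad_or_truncate(token_ids, max_len=MAX_SEQ_LEN, pad_id=0):
    """Cut sequences longer than max_len and right-pad shorter ones."""
    clipped = list(token_ids[:max_len])
    return clipped + [pad_id] * (max_len - len(clipped))


def iter_batches(examples, batch_size=BATCH_SIZE):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


examples = list(range(7866))  # stand-ins for the 7866 training tweets
train, val = train_val_split(examples)
steps_per_epoch = math.ceil(len(train) / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
```
]]></preformat>
        <p>With 7866 tweets and the rounding assumed here, the split yields 7079 training and 787 validation examples, which corresponds to 222 batches per epoch and 1110 batches over 5 epochs.</p>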
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <p>In this section, we present and discuss the results of our experiments on Subtask-1.</p>
      <sec id="sec-4-1">
        <title>4.1. Results on Subtask-1</title>
        <p>The evaluation metric used to assess our system on Subtask-1 is accuracy, as
specified by the competition organizers. Table 2 lists the results of the classifier built by
fine-tuning MARBERT on the training set and evaluated on the validation set.</p>
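        <p>For reference, accuracy is the fraction of tweets whose predicted label matches the gold label. A minimal sketch follows; the function name accuracy and the toy labels are illustrative assumptions and do not reproduce the organizers' evaluation script.</p>
        <preformat><![CDATA[
```python
def accuracy(gold, predicted):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


# Toy example with the two Subtask-1 labels (not real system output).
gold = ["misogyny", "none", "none", "misogyny"]
predicted = ["misogyny", "none", "misogyny", "misogyny"]
score = accuracy(gold, predicted)  # 3 of 4 labels match
```
]]></preformat>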
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Official Results</title>
        <p>Table 3 lists the results obtained by our model on the test set, as provided by the competition
organizers. We observe a large drop in model performance on the test set, which is most likely</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Error Analysis</title>
        <p>We performed an additional error analysis of our model's results. This analysis aims to find
where the model failed to categorize tweets correctly and to uncover the causes of these
misclassifications.</p>
        <p>We examined a random sample of 50 misclassified tweets and identified several reasons why
sarcastic tweets are classified as non-misogynistic and vice versa. These reasons are summarized
as follows:</p>
        <list list-type="bullet">
          <list-item><p>Imperfect human annotation: the diversity of annotators’ cultures and backgrounds may not have been taken into account throughout the annotation process.</p></list-item>
          <list-item><p>Absence of context: in some tweets the context is missing, which makes it impossible for our model to infer the intent and predict the label accurately. In particular, some highly offensive words are used sarcastically rather than with the intention to spread hate.</p></list-item>
          <list-item><p>Unprocessed emojis: since we left the emojis in tweets without any kind of preprocessing, we found that some emojis had an effect on the classification. In fact, 14% of the tweets contain at least one emoji.</p></list-item>
        </list>
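        <p>A statistic such as the share of tweets containing at least one emoji could be computed along the following lines. This is a rough sketch: the Unicode ranges below cover common emoji blocks only and are an assumption, as are the helper names has_emoji and emoji_share; a dedicated emoji library would give more complete coverage.</p>
        <preformat><![CDATA[
```python
import re

# Approximate emoji coverage (assumption): regional indicators,
# the main pictograph blocks, and miscellaneous symbols / dingbats.
EMOJI = re.compile(
    "[\U0001F1E6-\U0001F1FF"
    "\U0001F300-\U0001FAFF"
    "\u2600-\u27BF]"
)


def has_emoji(tweet: str) -> bool:
    """True if the tweet contains at least one character in the ranges above."""
    return EMOJI.search(tweet) is not None


def emoji_share(tweets) -> float:
    """Fraction of tweets that contain at least one emoji."""
    tweets = list(tweets)
    if not tweets:
        raise ValueError("no tweets given")
    return sum(1 for t in tweets if has_emoji(t)) / len(tweets)


# Toy input, not taken from the ArMI dataset.
sample = ["text \U0001F602", "plain text", "more text", "heart \u2764"]
share = emoji_share(sample)  # 2 of 4 tweets contain an emoji
```
]]></preformat>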
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>To identify misogyny in dialectal Arabic tweets, we first proposed a preprocessing strategy.
Secondly, we built a classification model based on the MARBERT language model, which was
selected for the final submission. We observed that the way emojis are handled in the preprocessing
step has a notable effect on classification performance.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Mulki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ghanem</surname>
          </string-name>
          ,
          <article-title>ArMI at FIRE2021: Overview of the First Shared Task on Arabic Misogyny Identification</article-title>
          , in:
          <source>Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation</source>
          ,
          <publisher-name>CEUR</publisher-name>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abdul-Mageed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elmadany</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M. B.</given-names>
            <surname>Nagoudi</surname>
          </string-name>
          ,
          <article-title>ARBERT &amp; MARBERT: Deep bidirectional transformers for Arabic</article-title>
          , in:
          <source>Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Online,
          <year>2021</year>
          , pp.
          <fpage>7088</fpage>
          -
          <lpage>7105</lpage>
          . URL: https://aclanthology.org/2021.acl-long.551.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in:
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Mulki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ghanem</surname>
          </string-name>
          ,
          <article-title>Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language</article-title>
          , in:
          <source>Proceedings of the 6th Arabic Natural Language Processing Workshop (WANLP 2021)</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Antoun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Baly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hajj</surname>
          </string-name>
          ,
          <article-title>AraBERT: Transformer-based model for Arabic language understanding</article-title>
          , in:
          <source>Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection</source>
          , Marseille, France,
          <year>2020</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>