<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Sexism Identification in Tweets using BERT and XLM-RoBERTa: Notebook for the EXIST Lab at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Maha</forename><surname>Usmani</surname></persName>
							<email>mahausmani71@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Computer Science Program</orgName>
								<orgName type="department" key="dep2">Dhanani School of Science and Engineering</orgName>
								<orgName type="institution">Habib University</orgName>
								<address>
									<settlement>Karachi</settlement>
									<country key="PK">Pakistan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rania</forename><surname>Siddiqui</surname></persName>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Computer Science Program</orgName>
								<orgName type="department" key="dep2">Dhanani School of Science and Engineering</orgName>
								<orgName type="institution">Habib University</orgName>
								<address>
									<settlement>Karachi</settlement>
									<country key="PK">Pakistan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Samin</forename><surname>Rizwan</surname></persName>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Computer Science Program</orgName>
								<orgName type="department" key="dep2">Dhanani School of Science and Engineering</orgName>
								<orgName type="institution">Habib University</orgName>
								<address>
									<settlement>Karachi</settlement>
									<country key="PK">Pakistan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Faryal</forename><surname>Khan</surname></persName>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Computer Science Program</orgName>
								<orgName type="department" key="dep2">Dhanani School of Science and Engineering</orgName>
								<orgName type="institution">Habib University</orgName>
								<address>
									<settlement>Karachi</settlement>
									<country key="PK">Pakistan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Faisal</forename><surname>Alvi</surname></persName>
							<email>faisal.alvi@sse.habib.edu.pk</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Computer Science Program</orgName>
								<orgName type="department" key="dep2">Dhanani School of Science and Engineering</orgName>
								<orgName type="institution">Habib University</orgName>
								<address>
									<settlement>Karachi</settlement>
									<country key="PK">Pakistan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Abdul</forename><surname>Samad</surname></persName>
							<email>abdul.samad@sse.habib.edu.pk</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Computer Science Program</orgName>
								<orgName type="department" key="dep2">Dhanani School of Science and Engineering</orgName>
								<orgName type="institution">Habib University</orgName>
								<address>
									<settlement>Karachi</settlement>
									<country key="PK">Pakistan</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Sexism Identification in Tweets using BERT and XLM-RoBERTa: Notebook for the EXIST Lab at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">40664F6F7D8916666663E0E9FE0D33EE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:51+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>BERT</term>
					<term>RoBERTa</term>
					<term>Sexism</term>
					<term>Tweets</term>
					<term>ensemble</term>
					<term>LLM</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The rapid growth of social media platforms has led to an increase in offensive content, often targeting specific demographic groups. This paper focuses on identifying and categorizing sexism in tweets collected from various social media platforms. We address three tasks from the EXIST 2024 lab, involving the classification of tweets in English and Spanish. These tasks include binary classification for sexism identification, source intention categorization of sexist tweets, and multi-label classification for different facets of sexism. Our approach employs BERT multilingual and XLM-RoBERTa models, along with an ensemble technique to enhance prediction accuracy. We evaluate the models using both hard labels, determined by majority vote, and soft labels, based on class probabilities.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In this paper, we aim to address the first three tasks of the EXIST 2024 lab <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, which involve classifying tweets in English and Spanish. The tasks are as follows:</p><p>Task 1: Sexism Identification in Tweets: Binary classification to determine whether a given tweet is sexist or not.</p><p>Task 2: Source Intention in Tweets: Categorizing messages classified as sexist according to the intention of the author: Direct (intentionally sexist), Reported (reporting a sexist situation), or Judgmental (condemning sexist behaviors).</p><p>Task 3: Sexism Categorization in Tweets: Categorizing sexist tweets into specific categories that represent different facets of sexism: Ideological and Inequality, Stereotyping and Dominance, Objectification, Sexual Violence, and Misogyny and Non-Sexual Violence <ref type="bibr" target="#b2">[3]</ref>.</p><p>The runs are evaluated using hard and soft labels. Hard labels are assigned by majority vote of the annotators, while soft labels are the probabilities of each class. Tasks 1 and 2 are mono-label, hence their class probabilities sum to one, while Task 3 is multi-label.</p></div>
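The hard/soft labelling scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's official scoring code; the function name and the six-vote example are ours:

```python
from collections import Counter

def hard_and_soft_labels(annotations):
    """Derive a hard label (majority vote) and soft labels (class
    probabilities) from one tweet's annotations.

    `annotations` is the list of per-annotator labels, e.g. six
    "YES"/"NO" votes for Task 1."""
    counts = Counter(annotations)
    hard = counts.most_common(1)[0][0]  # most frequent label wins
    soft = {label: n / len(annotations) for label, n in counts.items()}
    return hard, soft

# Six annotators, four of whom marked the tweet as sexist:
hard, soft = hard_and_soft_labels(["YES", "YES", "NO", "YES", "NO", "YES"])
# hard is "YES"; soft maps "YES" to 4/6 and "NO" to 2/6, summing to one
```

For a mono-label task the soft labels always sum to one, which matches the evaluation setup described for Tasks 1 and 2.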
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Literature Review</head><p>This literature review covers the techniques used by the top eight teams in EXIST 2023. AI-UPV <ref type="bibr" target="#b2">[3]</ref> ranked 1st in Task 3 with an ICM-Soft score of 0.7879, employing an ensemble approach that combined mBERT and XLM-RoBERTa for multilingual sexism identification across all three tasks. Team Mario achieved first place in Tasks 1 and 2, scoring 0.7850 and 0.7764 in ICM-Hard Norm, respectively <ref type="bibr" target="#b3">[4]</ref>. The team utilized GPT-NeoX and BERTIN-GPT-J-6B for multilingual sexism detection, emphasizing efficient multilingual modeling: GPT-NeoX was fine-tuned on task-specific data, while BERTIN-GPT-J-6B was first fine-tuned on an open-source hate-speech dataset and then on task-specific data. Team Classifiers <ref type="bibr" target="#b4">[5]</ref> secured 2nd place in Task 1 with an ICM-Hard score of 0.7026, relying on XLM-RoBERTa for hard classification in Task 3 and data augmentation for Task 1, showcasing multilingual sexism detection capabilities.</p><p>Team CIC-SDS.KN <ref type="bibr" target="#b5">[6]</ref> ranked 5th in Task 1 with an ICM-Hard score of 0.7302. They employed the Bernice <ref type="bibr" target="#b6">[7]</ref> model and contrastive learning for multilingual sexism identification, demonstrating effectiveness despite the challenges of Task 1. Team UniBo <ref type="bibr" target="#b7">[8]</ref> addressed Tasks 1, 2, and 3 on the detection and categorization of sexism in social networks. For Task 1, they compared a hate-tuned Transformer model (RobertaHate) with a multilingual model (XLM-R) for which Spanish input data was translated into English. The hate-tuned model performed better than the multilingual model, indicating the importance of fine-tuning models for specific tasks. For Task 2, the team introduced emotions as additional features using the EmoRoBERTa and EmoDistilRoBERTa models. These additional features improved the classification of sexism, with EmoRoBERTa providing a slightly larger performance boost than EmoDistilRoBERTa.</p><p>For Task 3, Team UniBo continued to explore the impact of emotions as additional features; the key finding was that emotions had a minimal impact on classification here, with EmoRoBERTa again providing a slight performance gain. Their ICM-Hard scores for Tasks 1, 2, and 3 were 0.7089, 0.7316, and 0.6352, respectively. Team ROH NEIL EXIST2023 achieved 4th place in Task 1 with a score of 0.7353, using transformer-based models and hyperparameter optimization for multilingual sexism detection and categorization. Team DRIM scored 0.5840 (based on soft evaluations) in Task 1, leveraging BERT models and a meta-model strategy for improved sexism detection and intention identification across Tasks 1, 2, and 3. Lastly, Team AI FHSTP <ref type="bibr" target="#b8">[9]</ref> ranked 19th in Task 1 with an ICM-Hard score of 0.6739, combining XLM-RoBERTa with sentiment embeddings and hand-crafted features for multi-task sexism identification and classification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Dataset</head><p>The dataset contains tweets in both English and Spanish, annotated by six annotators per tweet. For Tasks 1 and 2, each tweet is assigned a single label, representing a binary or categorical classification. In contrast, Task 3 is a multi-label classification problem, where each tweet can be associated with multiple labels. The preprocessing steps to derive both hard and soft labels are detailed in the following section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Our Approach</head><p>We used two models and an ensemble technique. The first is the BERT multilingual base model (uncased), an open-source model pretrained with a masked language modeling (MLM) objective on the 102 languages with the largest Wikipedias, including English and Spanish <ref type="bibr" target="#b9">[10]</ref>. The second is XLM-RoBERTa, a multilingual model trained on 100 languages <ref type="bibr" target="#b10">[11]</ref>. For Task 3, we also provide an ensemble approach that combines the predictions of both models for soft labels. Both models were fine-tuned for 5 epochs with a learning rate of 2 × 10⁻⁵ and a weight decay of 0.0048. The task-wise runs and their corresponding approaches are detailed in the following subsections:</p></div>
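Assuming the models were fine-tuned with the Hugging Face transformers library (the paper does not name the framework), the shared training configuration could be sketched as below. Only the epoch count, learning rate, and weight decay come from the text; the checkpoint names and Trainer wiring are assumptions:

```python
# Configuration sketch of the shared fine-tuning setup; hyperparameters as
# stated in the paper (5 epochs, lr 2e-5, weight decay 0.0048).
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-multilingual-uncased"   # or "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=2)                   # 2 classes for Task 1

args = TrainingArguments(
    output_dir="exist24-run",
    num_train_epochs=5,
    learning_rate=2e-5,
    weight_decay=0.0048,
)
# trainer = Trainer(model=model, args=args,
#                   train_dataset=..., eval_dataset=...)
# trainer.train()
```

The same configuration would be reused for both checkpoints, with `num_labels` adjusted per task (e.g. 4 intent classes for Task 2).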
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Task 1: Binary Classification</head><p>Task 1 was a binary classification problem. We submitted two runs for this task, both using hard labels:</p><p>• Run 1: Utilized the "bert-multilingual-uncased" model.</p><p>• Run 2: Utilized the "xlm-roberta" model.</p><p>For preprocessing, we considered a threshold of 3 for the number of annotators. If a tweet was labeled as sexist ("YES") by more than 3 annotators, it was assigned a label of 1; otherwise, it was assigned a label of 0.</p></div>
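The preprocessing rule above amounts to a one-line thresholding function (a minimal sketch; the function name is ours):

```python
def task1_hard_label(votes, threshold=3):
    """Binary hard label for Task 1: 1 (sexist) if more than `threshold`
    of the annotators labeled the tweet "YES", else 0."""
    return 1 if votes.count("YES") > threshold else 0

task1_hard_label(["YES"] * 4 + ["NO"] * 2)   # 4 votes > 3, so label 1
task1_hard_label(["YES"] * 3 + ["NO"] * 3)   # 3 votes is not > 3, so label 0
```

With six annotators, "more than 3" is equivalent to a strict majority.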
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Task 2: Source Intent Identification</head><p>Task 2 involved categorizing tweets according to the author's intent. The preprocessing involved the following steps:</p><p>• The assign-majority-label function determined the majority label among the annotators for each data point, filtering out those that did not meet a minimum threshold of agreement (in this case, at least 2 annotators). • The transform function assigned numeric values to the textual labels, mapping "DIRECT" to 1, "REPORTED" to 2, "JUDGMENTAL" to 3, and all other labels to 0. • Soft labels were obtained by calculating the probability of each class.</p><p>We submitted four runs for this task, utilizing both hard and soft labels with the BERT and XLM-RoBERTa models. The results are shown in Tables <ref type="table" target="#tab_1">2</ref> and <ref type="table" target="#tab_2">3</ref>.</p></div>
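The preprocessing steps above can be sketched as follows (an illustration under the stated thresholds; the function names are ours, and ties are resolved by whichever label `Counter` happens to return first):

```python
from collections import Counter

# Numeric mapping stated in the text; anything else maps to 0.
LABEL_MAP = {"DIRECT": 1, "REPORTED": 2, "JUDGMENTAL": 3}

def assign_majority_label(annotations, min_agreement=2):
    """Return the majority label if at least `min_agreement` annotators
    agree on it; otherwise None (the data point is filtered out)."""
    label, count = Counter(annotations).most_common(1)[0]
    return label if count >= min_agreement else None

def transform(label):
    """Map a textual intent label to its numeric class."""
    return LABEL_MAP.get(label, 0)

votes = ["DIRECT", "DIRECT", "JUDGMENTAL", "DIRECT", "NO", "DIRECT"]
majority = assign_majority_label(votes)   # "DIRECT": 4 of 6 annotators agree
transform(majority)                       # numeric class 1
```

Soft labels for this task would again be the per-class vote proportions, as in the mono-label scheme sketched earlier.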
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Task 3: Multi-Label Classification</head><p>Task 3 involved categorizing tweets based on sexism. Similar to the previous tasks, hard labels were obtained using the assign-majority-label function. This function classified the tweets into corresponding labels, returning either a single label or a list of labels depending on the outcome of the filtering process. The threshold for this task was set to 1, meaning a label was added if more than one annotator categorized a tweet accordingly. The Transform Function transformed a list of labels into a corresponding list of numeric values based on specific label mappings. For soft labels, we calculated the probability of each class.</p><p>We submitted three runs for this task:</p><p>• Run 1: Utilized the BERT model.</p><p>• Run 2: Utilized the XLM-RoBERTa model.</p><p>• Run 3: Employed an ensemble approach where XLM-RoBERTa and BERT were first trained independently on our datasets, and the ensemble model combined the predictions from both models to make the final prediction.</p></div>
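A minimal sketch of the Task 3 label derivation and the Run 3 ensemble follows. The category names are illustrative, and simple probability averaging is our assumption for how the two models' predictions were combined:

```python
from collections import Counter

def task3_hard_labels(annotations, threshold=1):
    """Multi-label hard labels: keep every category chosen by more than
    `threshold` annotators. Each annotation is that annotator's list of
    categories for the tweet."""
    counts = Counter(cat for ann in annotations for cat in ann)
    return sorted(cat for cat, n in counts.items() if n > threshold)

def ensemble_soft(p_bert, p_roberta):
    """Run 3 sketch: combine per-category probabilities from the two
    independently trained models by averaging (our assumption)."""
    return {cat: (p_bert[cat] + p_roberta[cat]) / 2 for cat in p_bert}

anns = [["MISOGYNY"], ["MISOGYNY", "OBJECTIFICATION"], ["OBJECTIFICATION"],
        ["MISOGYNY"], [], ["SEXUAL-VIOLENCE"]]
task3_hard_labels(anns)
# MISOGYNY (3 votes) and OBJECTIFICATION (2 votes) pass the threshold;
# SEXUAL-VIOLENCE (1 vote) does not
```

Because a tweet may keep several categories, the resulting soft labels need not sum to one, consistent with the multi-label setup.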
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head><p>Tables 1–5 present the results of all runs. ES and EN refer to the Spanish and English datasets, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Analysis</head><p>Comparing our results to those of the other participants, our findings show strong performance in several key areas. In all tasks, our approach performs better on the Spanish dataset than on the English dataset. The best results are obtained in source intention identification, with BERT ranking 8th on soft-soft labels. Specifically, for Task 1, our model using RoBERTa (ALL) achieved the highest F1-score of 0.7462, outperforming all other participants' models. RoBERTa (ALL) also demonstrated superior performance on both the ICM-Hard and ICM-Hard Norm metrics, with scores of 0.4398 and 0.7211, respectively, indicating robust and consistent results across both the English and Spanish datasets.</p><p>In Task 2, BERT performs much better than RoBERTa, even though both models were trained with the same parameters. Our BERT (ES) model stood out with an ICM-Hard score of 0.2306 and an F1-score of 0.5293, surpassing other participants' models in the same category. In Task 3, the ensemble approach performs better than both individual models in the hard-hard evaluation, while RoBERTa outperforms BERT and the ensemble in the soft-soft evaluation.</p><p>One possible explanation for the difference in performance across languages and tasks is how the models interact with the linguistic characteristics of the datasets. The Spanish dataset might contain features that BERT and the ensemble capture better, while the English dataset might have complexities better handled by RoBERTa. Additionally, the ensemble's performance in the hard-hard evaluation suggests that combining BERT and RoBERTa takes advantage of the strengths of both models for better generalization. Our results underscore the effectiveness and reliability of our approach, particularly in the context of the challenging tasks and datasets involved.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>In this study, we explored the performance of large language models (LLMs) such as BERT and RoBERTa on multiple sexism-detection tasks. We found that LLMs perform quite well on these tasks, although performance varies considerably depending on the language and the evaluation approach used. BERT outperformed RoBERTa in several tasks, while the ensemble approach showed potential for improved generalization by combining the strengths of both models. Overall, our results demonstrate that LLMs are effective at complex language-processing tasks and can serve as a practical basis for robust solutions for classifying and addressing sexism in text.</p><p>Our findings specifically highlighted that BERT achieved superior results, particularly on the Spanish dataset, suggesting that language-specific nuances play a significant role in model performance. The ensemble approach's consistent success in certain evaluations indicates that integrating multiple models can mitigate individual weaknesses and enhance overall robustness. These insights emphasize the importance of selecting appropriate models and combination techniques to address varied linguistic challenges in text classification tasks.</p><p>For future work, we aim to explore more advanced ensemble techniques, such as boosting and bagging, to further improve the performance of sexism detection across different languages and tasks. Additionally, we plan to integrate additional contextual embeddings and examine how the size and quality of the dataset affect model performance, which could offer valuable insights for developing improved training strategies. 
Expanding the scope of our datasets and refining our evaluation metrics will also be crucial steps in ensuring that our models are not only accurate but also adaptable to real-world applications in diverse linguistic contexts.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Task 1: Hard-Hard Labels for the Spanish and English Datasets Using BERT and RoBERTa</figDesc><table><row><cell>RUN</cell><cell cols="3">ICM -Hard ICM -Hard Norm F1-score</cell></row><row><cell>BERT (ALL)</cell><cell>0.3961</cell><cell>0.6991</cell><cell>0.7194</cell></row><row><cell cols="2">RoBERTa (ALL) 0.4398</cell><cell>0.7211</cell><cell>0.7462</cell></row><row><cell>BERT (ES)</cell><cell>0.4136</cell><cell>0.7068</cell><cell>0.7463</cell></row><row><cell>RoBERTa (ES)</cell><cell>0.4253</cell><cell>0.7127</cell><cell>0.7595</cell></row><row><cell>BERT (EN)</cell><cell>0.3587</cell><cell>0.6831</cell><cell>0.6821</cell></row><row><cell>RoBERTa (EN)</cell><cell>0.4395</cell><cell>0.7243</cell><cell>0.7280</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Task 2: Hard-Hard Labels for the Spanish and English Datasets Using BERT and RoBERTa</figDesc><table><row><cell>RUN</cell><cell cols="3">ICM -Hard ICM -Hard Norm F1-score</cell></row><row><cell>BERT (ALL)</cell><cell>0.1609</cell><cell>0.5523</cell><cell>0.4978</cell></row><row><cell cols="2">RoBERTa (ALL) -0.9078</cell><cell>0.2048</cell><cell>0.1899</cell></row><row><cell>BERT (ES)</cell><cell>0.2306</cell><cell>0.5720</cell><cell>0.5293</cell></row><row><cell>RoBERTa (ES)</cell><cell>-0.9850</cell><cell>0.1923</cell><cell>0.1850</cell></row><row><cell>BERT (EN)</cell><cell>0.0621</cell><cell>0.5215</cell><cell>0.4553</cell></row><row><cell>RoBERTa (EN)</cell><cell>-0.8242</cell><cell>0.2148</cell><cell>0.1951</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Task 2: Soft-Soft Labels for the Spanish and English Datasets Using BERT and RoBERTa</figDesc><table><row><cell>RUN</cell><cell cols="2">ICM -Soft ICM -Soft Norm</cell></row><row><cell>BERT (ALL)</cell><cell>-2.1737</cell><cell>0.3249</cell></row><row><cell cols="2">RoBERTa (ALL) -6.9170</cell><cell>0.0000</cell></row><row><cell>BERT (ES)</cell><cell>-1.7710</cell><cell>0.3582</cell></row><row><cell>RoBERTa (ES)</cell><cell>-6.6587</cell><cell>0.0000</cell></row><row><cell>BERT (EN)</cell><cell>-2.8802</cell><cell>0.2646</cell></row><row><cell>RoBERTa (EN)</cell><cell>-7.5545</cell><cell>0.0000</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>Task 3: Hard-Hard Labels for the Spanish and English Datasets Using BERT and RoBERTa</figDesc><table><row><cell>RUN</cell><cell cols="3">ICM -Hard ICM -Hard Norm F1-score</cell></row><row><cell>BERT (ALL)</cell><cell>-1.7482</cell><cell>0.0941</cell><cell>0.1700</cell></row><row><cell cols="2">RoBERTa (ALL) -1.6017</cell><cell>0.1281</cell><cell>0.1069</cell></row><row><cell cols="2">Ensemble (ALL) -1.5952</cell><cell>0.1296</cell><cell>0.1087</cell></row><row><cell>BERT (ES)</cell><cell>-1.7645</cell><cell>0.1060</cell><cell>0.1588</cell></row><row><cell>RoBERTa (ES)</cell><cell>-1.7289</cell><cell>0.1140</cell><cell>0.1030</cell></row><row><cell>Ensemble (ES)</cell><cell>-1.7229</cell><cell>0.1153</cell><cell>0.1061</cell></row><row><cell>BERT (EN)</cell><cell>-1.7214</cell><cell>0.0781</cell><cell>0.1816</cell></row><row><cell>RoBERTa (EN)</cell><cell>-1.4614</cell><cell>0.1418</cell><cell>0.1111</cell></row><row><cell>Ensemble (EN)</cell><cell>-1.4543</cell><cell>0.1436</cell><cell>0.1111</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5</head><label>5</label><figDesc>Task 3: Soft-Soft Labels for the Spanish and English Datasets Using BERT and RoBERTa</figDesc><table><row><cell>RUN</cell><cell cols="2">ICM -Soft ICM -Soft Norm</cell></row><row><cell>BERT (ALL)</cell><cell>-8.2508</cell><cell>0.0643</cell></row><row><cell cols="2">RoBERTa (ALL) -8.4277</cell><cell>0.0550</cell></row><row><cell cols="2">Ensemble (ALL) -8.4277</cell><cell>0.0550</cell></row><row><cell>BERT (ES)</cell><cell>-7.7274</cell><cell>0.0978</cell></row><row><cell>RoBERTa (ES)</cell><cell>-8.7035</cell><cell>0.0470</cell></row><row><cell>Ensemble (ES)</cell><cell>-8.7035</cell><cell>0.0470</cell></row><row><cell>BERT (EN)</cell><cell>-8.9622</cell><cell>0.0090</cell></row><row><cell>RoBERTa (EN)</cell><cell>-7.9811</cell><cell>0.0627</cell></row><row><cell>Ensemble (EN)</cell><cell>-7.9811</cell><cell>0.0627</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Acknowledgments</head><p>The authors would like to acknowledge the support provided by the Office Of Research (OoR) at Habib University, Karachi, Pakistan for funding this project through internal research grant IRG-2235.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of EXIST 2024 -Learning with Disagreement for Sexism Identification and Characterization in Social Networks and Memes</title>
		<author>
			<persName><forename type="first">L</forename><surname>Plaza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Carrillo-De-Albornoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Maeso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Amigó</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gonzalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Morante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Spina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association</title>
				<meeting><address><addrLine>CLEF</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024. 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of EXIST 2024 -Learning with Disagreement for Sexism Identification and Characterization in Social Networks and Memes (Extended Overview)</title>
		<author>
			<persName><forename type="first">L</forename><surname>Plaza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Carrillo-De-Albornoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Maeso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Amigó</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gonzalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Morante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Spina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2024 -Conference and Labs of the Evaluation Forum</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>Herrera</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F M</forename><surname>De Paula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Spina</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.03385</idno>
		<title level="m">Ai-upv at exist 2023-sexism characterization using large language models under the learning with disagreements regime</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Efficient multilingual sexism detection via large language models cascades</title>
		<author>
			<persName><forename type="first">L</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><surname>Radler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">I</forename><surname>Ersoy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Carpentieri</surname></persName>
		</author>
		<title level="m">Classifiers at exist 2023: Detecting sexism in spanish and english tweets with xlm-t</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Working Notes of CLEF</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Multilingual sexism identification using contrastive learning</title>
		<author>
			<persName><forename type="first">J</forename><surname>Angel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Aroyehun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gelbukh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Working Notes of CLEF</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Bernice: A multilingual pretrained encoder for twitter</title>
		<author>
			<persName><forename type="first">A</forename><surname>Delucia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mueller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Aguirre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Resnik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dredze</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 conference on empirical methods in natural language processing</title>
				<meeting>the 2022 conference on empirical methods in natural language processing</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="6191" to="6205" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Enriching hate-tuned transformer-based embeddings with emotions for the categorization of sexism</title>
		<author>
			<persName><forename type="first">A</forename><surname>Muti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Mancini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR WORKSHOP PROCEEDINGS</title>
				<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3497</biblScope>
			<biblScope unit="page" from="1012" to="1023" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Böck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schütz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Liakhovets</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">Q</forename><surname>Satriani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Babic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Slijepčević</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zeppelzauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Schindler</surname></persName>
		</author>
		<title level="m">Ait_fhstp at exist 2023 benchmark: sexism detection by transfer learning, sentiment and toxicity embeddings and hand-crafted features</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Working Notes of CLEF</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Conneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Khandelwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Chaudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wenzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Guzmán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1911.02116</idno>
		<title level="m">Unsupervised cross-lingual representation learning at scale</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
