<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">From Explanation to Detection: Multimodal Insights into Disagreement in Misogynous Memes</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Giulia</forename><surname>Rizzi</surname></persName>
							<email>g.rizzi10@campus.unimib.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Milano-Bicocca</orgName>
								<address>
									<settlement>Milan</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
							<email>prosso@dsic.upv.es</email>
							<affiliation key="aff1">
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
							<email>elisabetta.fersini@unimib.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Milano-Bicocca</orgName>
								<address>
									<settlement>Milan</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="department">Tenth Italian Conference on Computational Linguistics</orgName>
								<address>
									<addrLine>Dec 04 -06</addrLine>
									<postCode>2024</postCode>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">From Explanation to Detection: Multimodal Insights into Disagreement in Misogynous Memes</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">C362E1D9E94083E2570A8A11F658C37C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:38+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Disagreement</term>
					<term>Perspectivism</term>
					<term>Multimodal</term>
					<term>Misogyny</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Warning: This paper contains examples of language and images that may be offensive. This paper presents a probabilistic approach to identifying the disagreement-related elements in misogynistic memes by considering both modalities that compose a meme (i.e., the visual and textual sources). Several methodologies to exploit such elements in the identification of disagreement among annotators have been investigated and evaluated on the Multimedia Automatic Misogyny Identification (MAMI) [1] dataset. The proposed unsupervised approach reaches performance comparable to, and in some cases better than, state-of-the-art approaches, but with a reduced number of parameters to be estimated. The source code of our approaches is publicly available † .</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Online hate has become a serious concern in recent years, spreading across internet platforms and causing harm to individuals across various communities. In the online environment, users have found new modes of representation to express various types of hatred, including deeply rooted ideologies and beliefs with historical origins, for example towards women <ref type="bibr" target="#b1">[2]</ref>. Detecting abusive language has therefore become an increasingly important task. The challenges introduced by these new modes of representation, which require a multimodal analysis, are further compounded by the subjectivity of the task, which derives from the fact that individuals' perceptions of what characterizes a hateful message vary widely. Such diversification is reflected in the labeling phase in the form of disagreement among annotators. Identifying elements within a sample that can lead to disagreement is of paramount importance for several reasons: for content that can lead to disagreement, specific annotation policies might be introduced, and the number of annotators might be enlarged to capture multiple perspectives <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>. In this work, we propose a methodology to identify the disagreement-related elements in multimodal samples by exploring both their visual and textual components.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Works</head><p>Many natural language tasks, such as hate speech detection, humor detection, and sentiment analysis, involve subjectivity, since they require an interpretation based on human judgment, cultural context, or personal opinion <ref type="bibr" target="#b5">[6]</ref>. This phenomenon is reflected in datasets through multiple labels from different annotators or through confidence levels attached to ground-truth labels. Labels derived from different interpretations are therefore able to capture multiple perspectives and understandings <ref type="bibr" target="#b5">[6]</ref>. Information about annotators' disagreement has primarily been exploited as a means to improve data quality by excluding controversial instances <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref>. Alternatively, aiming to improve model performance, different strategies have been developed to exploit disagreement information in the training phase. For instance, in <ref type="bibr" target="#b8">[9]</ref>, the authors assign weights to instances to prioritize the ones with higher confidence levels. Another commonly adopted strategy <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b9">10]</ref> aims at directly learning from disagreement without considering any aggregated label. 
While a considerable amount of research has been conducted to understand the reasons behind annotators' disagreement <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b7">8]</ref> and to leverage disagreement when training classification models <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16,</ref><ref type="bibr" target="#b16">17,</ref><ref type="bibr" target="#b17">18,</ref><ref type="bibr" target="#b18">19]</ref>, comparatively little attention has been devoted to the explanation and a priori recognition of disagreement in hateful content. A taxonomy of possible reasons leading to annotators' disagreement has been proposed by <ref type="bibr" target="#b11">[12]</ref>. This taxonomy articulates four macro categories of reasons behind disagreement: sloppy annotations, ambiguity, missing information, and subjectivity. Moreover, the authors evaluate the impact of the different types on classification performance.</p><p>Only recently have works focused on the task of explaining disagreement <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b21">22,</ref><ref type="bibr" target="#b22">23]</ref>. In <ref type="bibr" target="#b20">[21]</ref>, the authors propose exploratory text visualization techniques as a method for analyzing different perspectives from annotated data. In <ref type="bibr" target="#b21">[22]</ref>, the authors identify textual constituents that contribute to the explanation of hateful messages by exploiting integrated gradients within a filtering strategy. A more recent approach <ref type="bibr" target="#b22">[23]</ref> proposes a probabilistic semantic approach for the identification of disagreement-related constituents (e.g., textual elements) in hateful content. 
Overall, the findings indicate that, while LLMs can yield promising results, comparable outcomes can be attained with less complex strategies and fewer computational resources. While previous research has concentrated on the analysis of textual disagreement, this study represents, to the best of our knowledge, a first insight into the explanation of multimodal disagreement. In particular, we have revised the methodology proposed in <ref type="bibr" target="#b22">[23]</ref> and extended it to the multimodal environment in order to consider not only textual elements but also visual ones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Proposed Approach</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Identification of Disagreement-Related Elements</head><p>The first phase of the proposed approach aims to evaluate the relationship between the elements (both visual and textual) that compose a meme and the disagreement among annotators. Preliminary preprocessing operations have been performed before identifying disagreement-related elements. Regarding the textual component, standard operations (i.e., tokenization, lemmatization, lowercasing, and stop-word removal) have been performed to identify a valid set of tokens that might be related to disagreement; to guarantee a more robust evaluation, tokens that appear fewer than 10 times in the dataset have been removed. Considering the image component, the set of 14 human-readable concepts (tags) identified by <ref type="bibr" target="#b23">[24]</ref> to capture specific characteristics of misogynous content has been adopted. As proposed by the authors, tags were extracted via the Clarifai API <ref type="bibr" target="#b24">[25]</ref>.</p><p>The preprocessing steps allowed us to extract a list of visual and textual elements from each meme in the dataset.</p><p>In order to measure the relationship between each element in the memes and the disagreement among annotators, the approach proposed in <ref type="bibr" target="#b22">[23]</ref> has been extended to a multimodal scenario. In particular, <ref type="bibr" target="#b22">[23]</ref> introduces a methodology to identify disagreement-related constituents that is, however, limited to textual content. The approach includes a strategy to identify disagreement-related textual constituents and an approach for generalization towards unseen textual constituents. 
Both methods have been extended to a multimodal scenario in order to identify disagreement-related elements in both the textual and visual sources that compose a meme.</p><p>Given an element 𝑒, a corresponding Element Disagreement Score (EDS(e)) has been computed according to the following equation:</p><formula xml:id="formula_0">𝐸𝐷𝑆(𝑒) = 𝑃 (𝐴𝑔𝑟𝑒𝑒|𝑒) − 𝑃 (¬𝐴𝑔𝑟𝑒𝑒|𝑒)<label>(1)</label></formula><p>where 𝑃 (𝐴𝑔𝑟𝑒𝑒|𝑒) represents the conditional probability that there is agreement on a meme given that the meme contains the element 𝑒. Analogously, 𝑃 (¬𝐴𝑔𝑟𝑒𝑒|𝑒) denotes the conditional probability that there is no agreement on a meme given that the meme contains the element 𝑒. Since the EDS is a difference between two complementary probabilities, it is bounded within the range of -1 to +1. A higher positive score indicates stronger agreement among annotators, whereas a more negative score indicates stronger disagreement.</p><p>The score can be estimated on the training data and exploited to identify additional disagreement-related elements on unseen memes.</p></div>
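As an illustrative sketch (not the authors' released code), the score in Eq. (1) can be estimated by counting, for each element, how many of the memes containing it received full annotator agreement; the input format (a list of (elements, agree) pairs) is an assumption made for the example:

```python
from collections import defaultdict

def element_disagreement_scores(memes):
    """Estimate EDS(e) = P(Agree | e) - P(not Agree | e) for every element.

    `memes` is a list of (elements, agree) pairs, where `elements` holds the
    textual tokens and visual tags of a meme and `agree` is True when all
    annotators assigned the same label (hypothetical input format).
    """
    agree_counts = defaultdict(int)  # memes containing e with full agreement
    total_counts = defaultdict(int)  # memes containing e overall
    for elements, agree in memes:
        for e in set(elements):
            total_counts[e] += 1
            agree_counts[e] += int(agree)
    scores = {}
    for e, n in total_counts.items():
        p_agree = agree_counts[e] / n
        # Difference of two complementary probabilities, bounded in [-1, +1].
        scores[e] = p_agree - (1 - p_agree)
    return scores

# Toy data: "flu" only occurs with agreement, "market" only with disagreement.
train = [({"flu", "woman"}, True), ({"market", "woman"}, False),
         ({"flu"}, True), ({"market"}, False)]
eds = element_disagreement_scores(train)
```

An element occurring only in memes with full agreement thus receives an EDS of +1, and one occurring only in contested memes an EDS of -1.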
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Disagreement identification</head><p>Once the Element Disagreement Scores have been estimated for each visual and textual element in the training dataset, they can be exploited to qualify the level of disagreement on unseen samples. Analogously to what was carried out in <ref type="bibr" target="#b22">[23]</ref>, different aggregation strategies have been investigated, relying on the hypothesis that the identified elements can be exploited to identify disagreement thanks to their different distributions in samples with and without agreement.</p><p>For each meme in the test set, the corresponding list of elements and the corresponding Element Disagreement Scores estimated on the training data have been extracted. In particular, for each meme, the textual and visual elements have been identified and paired with the corresponding score, when available. The Multimodal Disagreement Score (MDS) has been estimated according to the following strategies: Sum, Mean, Median, and Minimum. A threshold 𝜏 has been estimated via a grid-search approach for each strategy.</p><p>A qualitative evaluation, comprising a comparison with specific misogynistic terminology and an evaluation of the keywords included in the dataset creation phase, has been performed to assess the quality of the EDS, while both the F1-score for the two considered classes (agreement (+) and disagreement (-)) and a global F1-score have been computed to validate the MDS.</p></div>
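The aggregation step can be sketched as follows; the function names and input format are illustrative assumptions, not the authors' implementation:

```python
import statistics

def multimodal_disagreement_score(elements, eds, strategy="mean"):
    """Aggregate the EDS values of a meme's elements into a single MDS."""
    scores = [eds[e] for e in elements if e in eds]  # skip unscored elements
    if not scores:
        return 0.0
    aggregate = {"sum": sum, "mean": statistics.mean,
                 "median": statistics.median, "minimum": min}[strategy]
    return aggregate(scores)

def predict_agreement(elements, eds, tau, strategy="mean"):
    """Predict full annotator agreement when the MDS exceeds threshold tau."""
    return multimodal_disagreement_score(elements, eds, strategy) > tau

# Toy scores for a meme mixing agreement- and disagreement-related elements.
toy_eds = {"flu": 1.0, "market": -0.64, "woman": 0.2}
meme = {"flu", "market", "woman"}
```

In practice, 𝜏 would be selected by a grid search maximising the F1-score on held-out data, as described above.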
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Generalization towards unseen elements</head><p>The score estimation is strictly based on what is observed in the training data, resulting in missing scores for elements that do not appear in the training samples. This is particularly relevant for the textual components rather than the visual ones: while we can assume an open-world vocabulary for the textual source (where some terms in unseen data may not appear in the training set), we limited the visual tags to a closed-world setting (only the same 14 tags can be considered in both training and unseen memes). Since we need to generalize only on unseen textual constituents, for each (unseen) textual element 𝑒̂, an approximated EDS score has been computed as follows:</p><p>• Embeddings of the training lexicon: the contextualized embedding representation of each textual element 𝑒 has been obtained via mBERT <ref type="bibr" target="#b25">[26]</ref>.</p><p>An average embedding vector representation x ⃗ 𝑒 is computed to jointly represent the multiple embedding representations of 𝑒 derived from the different contexts where it occurs. In particular, given an element 𝑒 and 𝑁 sentences containing it, its vector representation x ⃗ 𝑒 is obtained by a simple average:</p><formula xml:id="formula_1">x ⃗ 𝑒 = 𝑁 ∑︀ 𝑖=1 v ⃗ 𝑖/𝑁</formula><p>where v ⃗ 𝑖 is the contextualized embedding vector related to the 𝑖 𝑡ℎ occurrence of 𝑒, obtained through mBERT. 
• Embeddings of the unseen term: given an unseen textual element 𝑒̂ within a given sentence, its contextualized embedding representation has been computed via mBERT <ref type="bibr" target="#b25">[26]</ref>.</p><p>• Most similar constituents: given an unseen textual element 𝑒̂ with the corresponding embedding v ⃗ 𝑒̂ and the average embedding x ⃗ 𝑒 of a training element 𝑒, the set 𝐷 of constituents most similar to 𝑒̂ is determined according to:</p><formula xml:id="formula_2">𝐷 = ⋃︁ 𝑒 {𝑒|𝑐𝑜𝑠(x ⃗ 𝑒, v ⃗ 𝑒̂) ≥ 𝜓}<label>(2)</label></formula><p>where</p><formula xml:id="formula_3">𝑐𝑜𝑠(x ⃗ 𝑒, v ⃗ 𝑒̂)</formula><p>is the cosine similarity between the average contextualized embedding representation of 𝑒 and the embedding of 𝑒̂, and 𝜓 is a threshold estimated via grid search.</p><p>• Unseen term score: the EDS score for an unseen textual element 𝑒̂ is computed as the weighted average of the scores of the most similar constituents 𝑒 of the training lexicon:</p><formula xml:id="formula_4">𝐸𝐷𝑆(𝑒̂) = ∑︀ 𝑒∈𝐷 [𝑐𝑜𝑠(x ⃗ 𝑒, v ⃗ 𝑒̂) • 𝐸𝐷𝑆(𝑒)] ∑︀ 𝑒∈𝐷 𝑐𝑜𝑠(x ⃗ 𝑒, v ⃗ 𝑒̂)<label>(3)</label></formula><p>• Multimodal Disagreement Score with unseen constituents: all the above strategies for MDS estimation have been extended to also include elements that do not belong to the training lexicon and for which the EDS score has been estimated. In particular, given a multimodal sample 𝑠, the aggregation functions presented in Section 3.2 consider in this case the 𝐸𝐷𝑆 values of both seen (via 𝐸𝐷𝑆(𝑒)) and unseen (via 𝐸𝐷𝑆(𝑒̂)) elements. Such generalized aggregation functions will later be referred to with the prefix 𝐺−.</p></div>
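The generalization step amounts to a similarity-weighted average over sufficiently similar training elements. A minimal sketch, assuming precomputed average mBERT vectors and treating 𝜓 as a lower bound on cosine similarity:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def estimate_unseen_eds(v_unseen, train_embeddings, eds, psi):
    """Approximate the EDS of an unseen term as the cosine-weighted average
    of the EDS of training elements whose similarity reaches psi.

    `train_embeddings` maps each training element to its average embedding
    vector (hypothetical precomputed dictionary).
    """
    sims = {e: cosine(x, v_unseen) for e, x in train_embeddings.items()}
    similar = {e: s for e, s in sims.items() if s >= psi}
    if not similar:
        return None  # no sufficiently similar constituent: leave unscored
    weighted = sum(s * eds[e] for e, s in similar.items())
    return weighted / sum(similar.values())

# Toy 2-d "embeddings" standing in for averaged mBERT vectors.
emb = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0])}
toy_eds = {"a": 1.0, "b": -1.0}
```

When no training element passes the similarity threshold, the unseen term is simply left unscored and ignored by the aggregation functions.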
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results</head><p>The proposed approach has been evaluated on the Multimedia Automatic Misogyny Identification (MAMI) dataset <ref type="bibr" target="#b0">[1]</ref>, consisting of 10,000 memes for training and 1,000 memes for testing <ref type="foot" target="#foot_0">2</ref>. The dataset comprises a range of memes that exemplify various forms of misogyny, including shaming, stereotyping, objectification, and violence. Each meme has been labeled by three crowdsourced annotators for misogynistic content <ref type="foot" target="#foot_1">3</ref>, with an estimated Fleiss' kappa <ref type="bibr" target="#b26">[27]</ref> coefficient equal to 0.5767. In particular, the proposed approach has been adopted to estimate an Element Disagreement Score (EDS) for each element and, consequently, an MDS for each meme in the dataset.</p><p>Table <ref type="table" target="#tab_0">1</ref> reports the top-10 highest positive and lowest negative disagreement scores derived for the textual component. We can notice that terms rarely linked with misogynistic messages (e.g., flu), terms commonly used to address women in a harmful way (e.g., whale), and terms exploiting stereotypes (e.g., gamer and programmer) achieve a high positive score, indicating a strong relation with agreement. Additionally, some personal names of famous people (i.e., Bernie and Miley) appear within the ranking. In particular, such names might appear in memes as the target of a hateful message, referring to their personal life, physical appearance, or specific events that involved them. As a consequence, depending on the reasons that lead to such criticism (gender, physical appearance, and personal choices for Miley Cyrus vs. political stance and career, without the same gendered connotations, for Bernie Sanders), there might be disagreement about misogyny. 
Table <ref type="table">2</ref> reports the top-5 highest positive and lowest negative disagreement scores derived for the visual component. It is easy to notice that all the scores are positive and small, denoting that such tags are only weakly related to the agreement label. Figure <ref type="figure" target="#fig_0">1</ref> reports an example of a meme with disagreement, along with the visual representation of the EDS of its textual and visual elements. Moreover, as highlighted with a grey bar, some of the reported scores have been estimated: such scores correspond to constituents that are not present in the training dataset and for which it was not possible to calculate the EDS directly, so the visual representation of the scores related to such elements corresponds to the score obtained through the estimation strategy. Overall, it is easy to notice the presence of elements strongly related to disagreement (i.e., sexual and market), highlighted in pink.</p><p>The concept of the "sexual marketplace" is often the </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Tags with the highest positive and lowest negative scores</p><p>subject of debate, particularly in relation to its intersection with misogynistic ideologies <ref type="bibr" target="#b27">[28,</ref><ref type="bibr" target="#b28">29]</ref>. Some supporters, often aligned with "manosphere" or "red pill" ideologies, argue that the sexual marketplace disproportionately empowers women, giving them more control over sexual selection and relationships, which can disadvantage men. Critics, on the other hand, assert that this perspective reduces human relationships to transactional exchanges and objectifies both genders, ultimately reinforcing misogynistic attitudes. This latter viewpoint holds that framing relationships in market terms devalues emotional connection and perpetuates harmful stereotypes about women's worth being tied solely to their sexual desirability. The achieved results suggest that the approach is able to detect this variety of interpretations and reflect it within the EDS scores. Figure <ref type="figure" target="#fig_1">2</ref> reports two memes that share the same text but differ in the image. Despite this commonality, the memes have been labeled differently: while the first meme has been labeled as misogynous by 2 annotators out of 3, the second one has been unanimously labeled as non-misogynous. Since the memes share a common textual representation, the derived textual elements and textual EDS are also equal, resulting in an indistinguishable representation that is ineffective for disagreement identification. Moreover, although the memes differ in their visual content, resulting in different tags and, therefore, different visual EDS, as previously mentioned, this component alone is not sufficient for disagreement prediction. 
The findings demonstrate the necessity of joint consideration of both modalities. All the proposed aggregation strategies have been implemented, both considering the modalities individually and jointly. Table <ref type="table" target="#tab_2">3</ref> and Table <ref type="table">4</ref> summarise the results achieved on disagreement identification considering only the scores of elements derived from the textual component (i.e., terms) and only the scores of elements derived from the visual component (i.e., tags), respectively. Table <ref type="table">5</ref>, instead, summarises the results achieved by aggregating the scores derived from all the elements (i.e., terms and tags). The results achieved on the textual component alone highlight G-Mean as the best-performing approach. Overall, the estimation strategy results in an improvement in performance of up to 6%, confirming the ability of the proposed strategy to capture disagreement relationships for unseen terms. Furthermore, BERT <ref type="bibr" target="#b29">[30]</ref>  <ref type="foot" target="#foot_2">4</ref> has been reported as a state-of-the-art baseline for unimodal textual classification. The achieved results show that BERT performs better on the majority class, struggling to predict the disagreement class. The proposed approach, instead, leads to a more balanced performance across the two classes.</p><p>Table <ref type="table">4</ref> reports the performances of the different approaches for disagreement identification considering the visual component only. While the Sum approach (i.e., the best-performing tag-based approach) demonstrates satisfactory performance in identifying positive instances (achieving an F1+ of 0.69), it exhibits considerable difficulty in accurately identifying negative instances.</p><p>Finally, Table <ref type="table">5</ref> reports the performances of the different approaches for disagreement identification jointly considering both modalities. 
Furthermore, for a better comparison of the performance achieved by the proposed approach, a state-of-the-art baseline for multimodal classification has been implemented: CLIP <ref type="bibr" target="#b30">[31]</ref> <ref type="foot" target="#foot_3">5</ref>. The inclusion of both modalities leads to a slight improvement in performance that, however, remains quite poor, highlighting the difficulty of the task. The inclusion of the unseen-constituent estimation leads to an improvement in performance (except for the sum-based method) of up to 8% for the mean-based approach. However, the best performances are achieved by the Minimum and G-Minimum approaches, for which the estimation methodology is not effective. Such behavior may be attributed to the imbalance in the dataset: the larger the number of samples with agreement, the greater the number of agreement-related terms that impact the estimation phase.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 5</head><p>Comparison of the different approaches for disagreement detection considering both textual and visual components. The agreement label (+) indicates complete annotator agreement, regardless of the misogyny value, while the agreement label (-) denotes samples without complete agreement. Bold denotes the best approach in terms of F1-score, and underline represents the best approach according to the disagreement label. 𝜓 and 𝜏 represent the best hyperparameters estimated via a grid-search approach, and 𝐸 is the set of elements.</p><p>Consequently, the estimation of scores for unseen elements is likely to be positive due to the aforementioned imbalance. Overall, the findings suggest that achieving a balanced performance remains challenging.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion and Future Works</head><p>This paper proposes a probabilistic approach to identify disagreement-related elements in multimodal content. The proposed approach allows for the identification of elements that could serve as a proxy to identify samples that might be perceived differently by the annotators and, therefore, could lead to disagreement. The achieved results highlight the difficulty of the task, denoting the need for more advanced approaches. Future work will include different strategies for image analysis in order to provide a better description of the image and of all the elements that compose it. Furthermore, a study of compositionality might be carried out to better represent the relationships among such elements inside the meme. The meaning of a meme is often derived from the meanings of its individual parts (i.e., the image and the text) and the way they are combined. By analyzing how the different elements interact and contribute to the overall message, it is possible to gain a deeper understanding of how the meaning is represented within the different modalities. This will help in identifying complex patterns and improve the accuracy of classification models.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Visual representation of disagreement scores distinguishing among textual and visual elements. Positive and negative scores are represented with green and pink respectively. The gray bar denotes elements for which the EDS has been estimated, while the white color represents elements with an EDS equal to zero.</figDesc><graphic coords="4,151.80,84.19,291.69,72.44" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Visual representation of disagreement scores distinguishing among textual and visual elements for two samples in the dataset. Positive and negative scores are represented with green and pink respectively. The white color represents elements with EDS equal to zero.</figDesc><graphic coords="5,151.80,84.19,291.68,92.46" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Terms with the highest positive and lowest negative scores</figDesc><table><row><cell>Term</cell><cell>EDS</cell><cell>Term</cell><cell>EDS</cell></row><row><cell>flu</cell><cell>1.00</cell><cell>market</cell><cell>−0.64</cell></row><row><cell>folk</cell><cell>1.00</cell><cell>fetish</cell><cell>−0.60</cell></row><row><cell>bug</cell><cell>1.00</cell><cell>nut</cell><cell>−0.57</cell></row><row><cell>Bernie</cell><cell>1.00</cell><cell>hotel</cell><cell>−0.50</cell></row><row><cell>whale</cell><cell>1.00</cell><cell>apologize</cell><cell>−0.45</cell></row><row><cell>feeling</cell><cell>0.90</cell><cell>Miley</cell><cell>−0.45</cell></row><row><cell>gamer</cell><cell>0.87</cell><cell>lonely</cell><cell>−0.43</cell></row><row><cell>rest</cell><cell>0.87</cell><cell>award</cell><cell>−0.43</cell></row><row><cell>programmer</cell><cell>0.87</cell><cell>coke</cell><cell>−0.43</cell></row><row><cell>san</cell><cell>0.83</cell><cell>blowjob</cell><cell>−0.43</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Comparison of the different approaches for disagreement detection considering the textual component only. The agreement label (+) indicates complete annotator agreement, regardless of the misogyny value, while the agreement label (-) denotes samples without complete agreement. Bold denotes the best approach in terms of F1-score, and underline represents the best approach according to the disagreement label.</figDesc><table><row><cell>Approach</cell><cell>𝜓</cell><cell>𝜏</cell><cell>F1+ F1-</cell><cell>F1 Score</cell></row><row><cell>Sum</cell><cell>-</cell><cell>3.1</cell><cell>0.61 0.39</cell><cell>0.50</cell></row><row><cell>Mean</cell><cell>-</cell><cell>0.2</cell><cell>0.78 0.20</cell><cell>0.49</cell></row><row><cell>Median</cell><cell>-</cell><cell>0.2</cell><cell>0.07 0.79</cell><cell>0.43</cell></row><row><cell>Minimum</cell><cell>-</cell><cell>-0.1</cell><cell>0.29 0.75</cell><cell>0.52</cell></row><row><cell>G-Sum</cell><cell>0.8</cell><cell>3.1</cell><cell>0.65 0.37</cell><cell>0.51</cell></row><row><cell>G-Mean</cell><cell>0.8</cell><cell>0.2</cell><cell>0.73 0.34</cell><cell>0.53</cell></row><row><cell>G-Median</cell><cell>0.8</cell><cell>0.2</cell><cell>0.77 0.21</cell><cell>0.49</cell></row><row><cell>G-Minimum</cell><cell>0.8</cell><cell>-0.1</cell><cell>0.75 0.30</cell><cell>0.52</cell></row><row><cell>BERT [30]</cell><cell>-</cell><cell>-</cell><cell>0.80 0.00</cell><cell>0.40</cell></row></table><note>𝜓 and 𝜏 represent the best hyperparameters estimated via a grid-search approach.</note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">Although both a training and a test dataset are provided, only the training dataset is adopted, as the proposed work is focused on the analysis and prediction of disagreement and the test dataset is constructed to include only samples with complete agreement. The training dataset, instead, is characterized by 65% of samples with complete agreement. It has therefore been divided so as to use 90% for token estimation and the remaining 10% for evaluation.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">Additionally, a boolean disagreement label has been derived to represent complete agreement among annotators. In particular, this label is set to 1 if all the annotators indicated the same label, and to 0 otherwise.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">BERT has been implemented and fine-tuned using the HuggingFace framework with default hyperparameters. We adopted "bert-base-cased", available at https://huggingface.co/google-bert/bert-base-cased.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">CLIP has been implemented and fine-tuned using the HuggingFace framework with default hyperparameters. In particular, we used the version available at https://huggingface.co/openai/clip-vit-large-patch14, to which we concatenated a linear layer for binary classification.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>We acknowledge the support of the PNRR ICSC National Research Centre for High Performance Computing, Big Data and Quantum Computing (CN00000013), under the NRRP MUR program funded by the NextGenerationEU. The work of Paolo Rosso was in the framework of the FairTransNLP-Stereotypes research project (PID2021-124361OB-C31) funded by MCIN/AEI/10.13039/501100011033 and by ERDF, EU A way of making Europe.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">SemEval-2022 task 5: Multimedia automatic misogyny identification</title>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Gasparini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Saibene</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lees</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sorensen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), Association for Computational Linguistics</title>
				<meeting>the 16th International Workshop on Semantic Evaluation (SemEval-2022), Association for Computational Linguistics<address><addrLine>Seattle, United States</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="533" to="549" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">How do we study misogyny in the digital age? a systematic literature review using a computational linguistic approach</title>
		<author>
			<persName><forename type="first">L</forename><surname>Fontanella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ignazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sarra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tontodimamma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Humanities and Social Sciences Communications</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="1" to="15" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Handling disagreement in hate speech modelling</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">Kralj</forename><surname>Novak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Scantamburlo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pelicon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cinelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mozetič</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Zollo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="681" to="695" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">GRaSP: A multilayered annotation scheme for perspectives</title>
		<author>
			<persName><forename type="first">C</forename><surname>Van Son</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fokkens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Maks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Morante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Vossen</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/L16-1187" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC&apos;16), European Language Resources Association (ELRA)</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Choukri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Declerck</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Goggi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Grobelnik</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Maegaard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Mariani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Mazo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Moreno</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Odijk</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Piperidis</surname></persName>
		</editor>
		<meeting>the Tenth International Conference on Language Resources and Evaluation (LREC&apos;16), European Language Resources Association (ELRA)<address><addrLine>Portorož, Slovenia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1177" to="1184" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Perspectivist approaches to natural language processing: a survey</title>
		<author>
			<persName><forename type="first">S</forename><surname>Frenda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Abercrombie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pedrani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Panizzon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T</forename><surname>Cignarella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Marco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bernardi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Language Resources and Evaluation</title>
				<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="1" to="28" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Learning from disagreement: A survey</title>
		<author>
			<persName><forename type="first">A</forename><surname>Uma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Fornaciari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Paun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Plank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Poesio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Artificial Intelligence Research</title>
		<imprint>
			<biblScope unit="volume">72</biblScope>
			<biblScope unit="page" from="1385" to="1470" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">From annotator agreement to noise models</title>
		<author>
			<persName><forename type="first">B</forename><surname>Beigman Klebanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Beigman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="495" to="503" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The origin and value of disagreement among data labelers: A case study of individual differences in hate speech annotation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Sang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Stanton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information for a Better World: Shaping the Global Future: 17th International Conference, iConference 2022, Virtual Event</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022-03-04">February 28 - March 4, 2022</date>
			<biblScope unit="page" from="425" to="444" />
		</imprint>
	</monogr>
	<note>Proceedings, Part I</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A crowdsourced frame disambiguation corpus with ambiguity</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dumitrache</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Mediagroep</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Welty</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL-HLT</title>
				<meeting>NAACL-HLT</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="2164" to="2170" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Beyond black &amp; white: Leveraging annotator disagreement via soft-label multi-task learning</title>
		<author>
			<persName><forename type="first">T</forename><surname>Fornaciari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Uma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Paun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Plank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Poesio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Crowd worker strategies in relevance judgment tasks</title>
		<author>
			<persName><forename type="first">L</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Maddalena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Checco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sarasua</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Gadiraju</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Roitero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Demartini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th international conference on web search and data mining</title>
				<meeting>the 13th international conference on web search and data mining</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="241" to="249" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Why don&apos;t you do it right? analysing annotators&apos; disagreement in subjective tasks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Sandri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Leonardelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tonelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ježek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics</title>
				<meeting>the 17th Conference of the European Chapter of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2428" to="2441" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Shahriar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Solorio</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.01050</idno>
	<title level="m">SafeWebUH at SemEval-2023 task 11: Learning annotator disagreement in derogatory text: Comparison of direct training vs aggregation</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">eevvgg at SemEval-2023 task 11: Offensive language classification with rater-based information</title>
		<author>
			<persName><forename type="first">E</forename><surname>Gajewska</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.semeval-1.24</idno>
		<ptr target="https://aclanthology.org/2023.semeval-1.24" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Ojha</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Doğruöz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Tayyar Madabushi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Kumar</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Sartori</surname></persName>
		</editor>
		<meeting>the 17th International Workshop on Semantic Evaluation (SemEval-2023), Association for Computational Linguistics<address><addrLine>Toronto, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="171" to="176" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">University at Buffalo at SemEval-2023 task 11: MASDA, modelling annotator sensibilities through disaggregation</title>
		<author>
			<persName><forename type="first">M</forename><surname>Sullivan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yasin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Jacobs</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th International Workshop on Semantic Evaluation</title>
				<meeting>the 17th International Workshop on Semantic Evaluation (SemEval-2023)</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="978" to="985" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">AI-UPV at EXIST 2023: Sexism characterization using large language models under the learning with disagreements regime</title>
		<author>
			<persName><forename type="first">A</forename><surname>De Paula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Spina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3497</biblScope>
			<biblScope unit="page" from="985" to="999" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">When multiple perspectives and an optimization process lead to better performance, an automatic sexism identification on social media with pretrained transformers in a soft label context</title>
		<author>
			<persName><forename type="first">J</forename><surname>Erbani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Egyed-Zsigmond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nurbakova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-E</forename><surname>Portier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Vallecillo-Rodríguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Del Arco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Ureña-López</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Martín-Valdivia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Montejo-Ráez</surname></persName>
		</author>
		<title level="m">Integrating annotator information in transformer finetuning for sexism detection</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Working Notes of CLEF</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Perspectives on hate: General vs. domain-specific models</title>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fontana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024</title>
				<meeting>the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024</meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="78" to="83" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Change my mind: How syntax-based hate speech recognizer can uncover hidden motivations based on different viewpoints</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mastromattei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Zanzotto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">1st Workshop on Perspectivist Approaches to Disagreement in NLP, NLPerspectives 2022 as part of Language Resources and Evaluation Conference, LREC 2022 Workshop, European Language Resources Association (ELRA)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="117" to="125" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Beyond explanation: A case for exploratory text visualizations of non-aggregated, annotated datasets</title>
		<author>
			<persName><forename type="first">L</forename><surname>Havens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Terras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Alex</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.nlperspectives-1.10" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022, European Language Resources Association</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Abercrombie</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Tonelli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Rieser</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Uma</surname></persName>
		</editor>
		<meeting>the 1st Workshop on Perspectivist Approaches to NLP @LREC2022, European Language Resources Association<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="73" to="82" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Integrated gradients as proxy of disagreement in hateful content</title>
		<author>
			<persName><forename type="first">A</forename><surname>Astorino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3596</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Unraveling disagreement constituents in hateful speech</title>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Astorino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Information Retrieval</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="21" to="29" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Recognizing misogynous memes: Biased models and tricky archetypes</title>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Gasparini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Saibene</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="page">103474</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<author>
			<persName><surname>Clarifai</surname></persName>
		</author>
		<ptr target="https://docs.clarifai.com/" />
		<title level="m">Clarifai guide</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL-HLT</title>
				<meeting>NAACL-HLT</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Measuring nominal scale agreement among many raters</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Fleiss</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychological Bulletin</title>
		<imprint>
			<biblScope unit="volume">76</biblScope>
			<biblScope unit="page">378</biblScope>
			<date type="published" when="1971">1971</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Ging</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neary</surname></persName>
		</author>
		<title level="m">Gender, sexuality, and bullying special issue editorial</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Exploring misogyny through time: From historical origins to modern complexities</title>
		<author>
			<persName><forename type="first">E</forename><surname>Ignazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sarra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fontanella</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Philosophies of Communication</title>
		<imprint>
			<biblScope unit="page" from="195" to="214" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL-HLT</title>
				<meeting>NAACL-HLT</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Learning transferable visual models from natural language supervision</title>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hallacy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Goh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sastry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mishkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International conference on machine learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="8748" to="8763" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
