<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Multimodal Online Manipulation: Empirical Analysis of Fact-Checking Reports</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Olga</forename><surname>Uryupina</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Information Engineering and Computer Science</orgName>
								<orgName type="institution">University of Trento</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Multimodal Online Manipulation: Empirical Analysis of Fact-Checking Reports</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">4C0B2F7CEBEFAB97BFD22D26732A3571</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:37+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>fact checking</term>
					<term>multi modal</term>
					<term>annotation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents an in-depth exploratory quantitative study of the interaction between multimedia and textual components in online manipulative content. We discuss relations between content layers (such as proof or support) as well as unscrupulous techniques compromising visual content. The study is based on fakes reported and analyzed by PolitiFact and comprises documents from Facebook, Twitter and Instagram. We identify several pervasive phenomena currently affecting the impact of manipulative content on the reader and the possible strategies for effective debunking, and discuss promising research directions.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Manipulative online content (fake news, propaganda, among others) is growing at an alarming rate, hindering our access to truthful and unbiased information and thus threatening the principles of democratic society. The problem has been addressed by professional journalists, who, with the help of crowd-workers, fight a never-ending battle to prevent information contamination. To enable a large-scale response to the misinformation threat, the AI community has invested considerable effort into building competitive models for identifying non-transparent content, such as false claims or altered videos (deep fakes). However, we still lack a thorough understanding of manipulative content and of the multiple aspects affecting its perception and impact on the reader. This paper aims at an in-depth analysis of one such aspect, namely, the interaction between different (multimedia) layers of the manipulative message. More specifically, we study the semantics underlying the relation between the multimedia and textual parts of fake news. Our study is based on around 800 fakes from January to September 2022, as identified and analyzed by PolitiFact. 1 </p><p>Multimedia content, such as videos, reels, photos, screenshots or images, is becoming increasingly popular in social media: it is an appealing and powerful way of expressing and/or enhancing one's message. Nevertheless, as a scientific community, we still have little understanding of the way authors integrate multimedia into their content: most research so far has focused on a specific component rather than on the interplay between components. Our study aims at identifying the role of the multimedia part of manipulative messages.</p><p>Figure <ref type="figure">1</ref> shows some examples from potential fakes analyzed by PolitiFact. We observe different relations between the text and the image. 
In particular, in (1a), the video is supposed to prove the claim by providing direct evidence, whereas in (1b), the image provides support (appeal to authority). In (1c), the image is a visual paraphrase of the claim, enhancing its appeal but not providing extra proof, support or informational material. Finally, in (1d), the photo is an illustration that, while depicting the person discussed, does not aim at being relevant to the claim's veracity or impact. While understanding the relation between the image and the text is interesting from the scientific perspective, it is also a crucial prerequisite for an efficient and meaningful fact-checking response. For example, if a supposed proof is a compromised photo, the response should highlight this fact (e.g., the video in (1a) has been cropped, misrepresenting the quote, which should be highlighted in the fact-checking report). On the contrary, if a compromised photo is used as a mere illustration, an effective fact-checking report should focus on the textual claim per se.</p><p>Another important angle is the issue affecting the multimedia part. In our example, the video in (1a) is cropped. On the contrary, (1b) is an authentic screenshot, yet it has been miscaptioned by the claim: older content, irrelevant to current events and topics, has been repurposed.</p><p>The current paper focuses on these two aspects to analyze empirically the interplay between multimedia and textual components in fake news, as identified by PolitiFact. To this end, we reannotate the PolyFake dataset <ref type="bibr" target="#b0">[1]</ref> with fine-grained labels reflecting multimedia aspects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>While fact checking has recently been receiving an increasing amount of attention from both the NLP and Vision communities, only very few studies focus on the interaction between different modalities. A breakthrough approach by Vempala and Preoţiuc-Pietro <ref type="bibr" target="#b1">[2]</ref> focuses on two dimensions of the relationship between text and image on Twitter: whether the text is represented in the image and whether the image adds extra content to the textual message. Cheema et al. <ref type="bibr" target="#b2">[3]</ref> propose a dataset of multimodal tweets, annotated for visual relevancy and checkworthiness. Finally, Biamby et al. <ref type="bibr" target="#b3">[4]</ref> propose a larger-scale dataset of multimodal tweets, where "falsified" claims have been added synthetically to address the image repurposing problem.</p><p>These studies have paved the way for evaluation campaigns and benchmarking resources, for example, <ref type="bibr" target="#b4">[5]</ref>. Yet, these studies rely on rather straightforward annotation guidelines to reduce the per-claim cost. Moreover, the annotators are not professional fact-checkers: while they can assess some aspects of compromised content, they can still get deceived by more challenging cases; after all, the manipulative content has been created on purpose to influence and bias the reader.</p><p>In a recent survey, Mubashara et al. <ref type="bibr" target="#b5">[6]</ref> highlight the importance of an interdisciplinary approach to fact-checking, proposing a framework to model different axes of online manipulation, most importantly fusing textual and visual fact-checking, and surveying benchmarks and models developed by the respective communities. 
Our study is built upon the same motivation: our main goal is to study empirically the interplay between different modalities, based on real-world (i.e., not simulated or synthesized) fake data.</p><p>Our study aims at an in-depth exploratory analysis of multimodal online content. To this end, we focus on more specific labels to describe the relationship between different layers/modalities. We extend the scope of our study to cover all three major platforms (Facebook, Instagram and Twitter). Moreover, our input is not only the claim per se, but also the professionally created fact-checking report from PolitiFact. In our experience, PolitiFact reports contain a wealth of information about online manipulation: as opposed to the 2-3 binary labels of common NLP fact-checking benchmarks, PolitiFact characterizes each claim with 1-3 pages of analytics. These analytics, however, come in free textual form. While it might still be impossible for the NLP community to encode these reports for building high-quality fact-checking systems, we believe that we should at least learn from them to get better insights, stop trivializing the task and highlight understudied, yet impactful, subtasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Analyzing Multimedia Content</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">PolyFake</head><p>Our study is based on the PolyFake dataset <ref type="bibr" target="#b0">[1]</ref> covering fake news from 2022, as analyzed by professional fact-checkers from the PolitiFact agency.<note place="foot" n="2">PolyFake annotation guidelines cover a wide range of phenomena related to online manipulation: from fallacious/propaganda reasoning to emotive appeals, factual veracity etc. The current study aims at an in-depth analysis of a specific angle. The Appendix discusses the distribution of veracity labels across PolyFake documents.</note> The current study is based on the first nine months of PolyFake (818 entries). Each entry has been re-assessed by two annotators, with further adjudication by the supervisor. The original PolyFake labels are binary and encode more generic properties of fake news (e.g. whether the reasoning is fallacious or whether the document triggers emotions).</p><p>For the present study, we have designed and iteratively refined annotation guidelines for labelling multimedia aspects of manipulative content.</p><p>The annotation process is based on jointly consulting not only the original content but also the PolitiFact report. This way we make use of the wealth of analytics provided by experienced professional fact-checkers by encoding it in more structured annotation labels.</p><p>PolyFake covers fakes from different social media platforms (Twitter, Facebook, Instagram, TikTok, Threads and YouTube). Note that manipulative content often gets propagated across platforms through re-posts, sharing, linking or simply copying. For example, a large proportion of Facebook videos originates from TikTok (in this case, PolitiFact typically analyzes the Facebook message, hence the low number of TikTok entries in the table). In the following study, we omit TikTok, YouTube and Telegram as largely underrepresented categories with rather straightforward patterns.</p></div>
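The double-annotation setup described above (two annotators per entry, then adjudication) is usually accompanied by an agreement check. The sketch below computes Cohen's kappa for two annotators; the binary labels are made up for illustration, as the paper reports no agreement figures.

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(ann1)
    # Observed agreement: fraction of items with identical labels.
    po = sum(a == b for a, b in zip(ann1, ann2)) / n
    # Expected chance agreement under independent labelling.
    c1, c2 = Counter(ann1), Counter(ann2)
    pe = sum(c1[lab] * c2[lab] for lab in c1) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical binary judgements (e.g. "does the layer present a proof?").
a1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
a2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(a1, a2), 3))
```

For these toy labels, observed agreement is 6/8 = 0.75 and chance agreement is 0.5, giving kappa = 0.5.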
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Multimedia and Layered Content</head><p>Layer Types. Table <ref type="table" target="#tab_0">1</ref> shows the distribution of different media types for each platform. We have identified several types of layered content: parts of the message rendered together with the initial post. The most common ones are videos (including reels), photos and screenshots (typically, complex visual objects combining textual content with photos/images and referring the reader to a different source). We have also observed images (infographics, maps or drawings), links (this content is typically rendered with a photo/stillshot, yet it explicitly points to a different online location, for example, a promotion website) and threads (characteristic of Twitter, this type of layering helps to contextualize the message). On rare occasions, social media posts might contain more than one extra layer (e.g., both videos and photos). Most importantly, only 18% of PolyFake documents are purely textual: adhering to the popular adage that a picture is worth a thousand words, manipulative content creators use visuals for a variety of purposes, from increasing their outreach to improving their credibility. Moreover, the prevalence of multimedia content is far more pronounced on Facebook and Instagram, the two platforms not typically addressed by NLP practitioners. This alone suggests that we need to pay much more attention to joint models and start with a deeper understanding of the relevant phenomena.</p><p>A large percentage of documents re-use or spread already existing information. This is true for screenshots (21% in total) and links (5%), but also for many videos: only very few videos represent original content. While there exist some studies on identifying previously fact-checked claims, they are restricted to textual content. 
We believe that a more complex multimodal approach would be beneficial here.</p><p>For presentation purposes, in what follows we merge the underrepresented categories link, image and thread with the roughly functionally similar major categories screenshot, photo and screenshot, respectively.</p><p>Layer Roles. Table <ref type="table" target="#tab_1">2</ref> shows the different roles multimedia layers play in PolyFake documents. We distinguish between the following roles: content (the essential part of the content is presented on the multimedia layer, whereas the textual layer just adds minor details or suggests opinions), proof (the multimedia layer offers a physical proof, cf. Example (1a)), support (the multimedia layer provides some material from a reputable source to support the claim, cf. Example (1b)), paraphrase (the multimedia layer paraphrases the claim without adding any extra angle, cf. Example (1c)), context (while the textual claim is generally self-contained, it cannot be interpreted without the context given by the multimedia part, e.g., the claim contains pronouns and the image presents their referents), illustration (the multimedia layer shows some objects/persons mentioned in the claim without any connection to its semantics, cf. Example (1d)) and action (the multimedia layer suggests an appropriate reaction to the claim, for example, a scam website). Finally, a rather common role for videos and photos is anchor: in such cases, the textual claim is about the multimedia itself (for example, "the sharpest image of the sun ever recorded"). Here, the multimedia is not compromised per se and the textual claim contains no falsehoods about the world, yet the combination might be very misleading.</p><p>In more than half of the documents, multimodal layers provide essential content. This is true for all the media types (videos, photos and screenshots). 
We have observed several possible factors contributing to this effect: in general, social media users tend to repost existing "fancy" content rather than create their own texts. Even in authentic self-created posts, the message is often put in a visual, whereas the text only adds some emotion. We believe that there is a wide variety of potential reasons for this behaviour (e.g., videos and photos get more likes, whereas texts are mostly ignored by peers), requiring a more specialized study. Almost one third of multimedia layers, especially videos, supposedly present proofs. Such compromised proofs are out of reach for modern evidence-based automatic fact-checking: while a fact-checking model can provide extensive evidence to refute a claim, the user would still trust the video/photo and not the model. Human fact-checkers address such proofs from a different, more promising, perspective: they try to explicitly attack and debunk the proof. We believe that this is a very important and largely unaddressed research direction.</p><p>Issues with multimedia layers. Finally, we have identified the most common unscrupulous techniques relevant for multimedia layers. These include: crop (essential part(s) of the original message are omitted to render it out of context, cf. Example (1a)); miscaption (while the image/video is authentic, the textual claim misleads w.r.t. some crucial details, e.g. events or timeline, cf. 
Example (1b)); altered/fake (the image/video has been altered, beyond cropping, with specialized software, including deep fakes); misperception (the image/video is, deliberately or not, deceiving because of its low quality, unclear angle, optical effects etc.); noproof (the video, typically a long one, does not contain any components relevant for the claim); falsehood (the video/image is authentic, yet its content is untrue, i.e., the textual claim spreads the original fake generated by the video/image); and explain (the textual part explains, misleadingly, what we are supposed to see in the video, often of rather low quality).</p><p>Table <ref type="table" target="#tab_2">3</ref> summarizes the distribution of problematic issues across the three main multimedia types, showing several trends. First, video layers provide more possibilities for unscrupulous content generators: cropped, otherwise altered or low-quality videos are pervasive in manipulative content. While most of the research focuses on images, they do not exhibit such a variety of manipulative strategies. Screenshots, authentic or fake, are largely used to disseminate falsehoods. At the same time, an increasing number of authentic videos, mostly originating from TikTok, is created to spread falsehoods and promote "critical thinking" (i.e., conspiracy theories as opposed to rational argumentation). These remain largely understudied, despite their large impact on the audience. Another rather unstudied area is explanatory claims: authentic videos/photos accompanied by misleading explanations of what we see and what it means; in such cases, the factual component might be non-compromised, yet the biased explanation makes the whole message an impactful and hard-to-debunk propaganda tool. Finally, unlike videos and screenshots, most photos convey true, authentic information: the textual claims either rely on them as illustrations or use them as building blocks to support fallacious argumentation.</p></div>
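The label inventory introduced in this section (layer types, layer roles and layer issues) can be summarized as a small annotation schema. The label values below follow the paper; the class and field names are our own illustrative choices, not part of the PolyFake release.

```python
from dataclasses import dataclass
from enum import Enum

class LayerType(Enum):
    VIDEO = "video"
    PHOTO = "photo"            # photo+ in the tables also absorbs "image"
    SCREENSHOT = "screenshot"  # screenshot+ also absorbs "link" and "thread"
    LINK = "link"
    IMAGE = "image"
    THREAD = "thread"

class LayerRole(Enum):
    CONTENT = "content"
    ANCHOR = "anchor"
    PROOF = "proof"
    SUPPORT = "support"
    PARAPHRASE = "paraphrase"
    CONTEXT = "context"
    ILLUSTRATION = "illustration"
    ACTION = "action"

class LayerIssue(Enum):
    CROP = "crop"
    MISCAPTION = "miscaption"
    ALTERED_FAKE = "altered/fake"
    MISPERCEPTION = "misperception"
    NOPROOF = "noproof"
    FALSEHOOD = "falsehood"
    EXPLAIN = "explain"
    NONE = "none"

@dataclass
class LayerAnnotation:
    """One multimedia layer of a PolyFake document, with its role and issue."""
    layer: LayerType
    role: LayerRole
    issue: LayerIssue

# Example (1a): a video offered as proof, compromised by cropping.
ex = LayerAnnotation(LayerType.VIDEO, LayerRole.PROOF, LayerIssue.CROP)
print(ex.role.value, ex.issue.value)
```

Encoding the labels this way makes the cross-tabulations in Tables 1-3 a matter of counting (layer, role) and (layer, issue) pairs over the annotated entries.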
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>We have presented an in-depth analysis of the interaction between textual and multimedia components of compromised social media documents. We have identified several high-impact issues that are currently insufficiently studied by the community. These include the interaction between different modalities, the role of the multimedia part and its impact on selecting a successful fact-checking strategy, the differences between platforms and media types (current NLP studies predominantly focus on Twitter and images) and the importance of a more principled approach to content re-use. We hope that this study, motivated by human fact-checking expertise, can spark a meaningful discussion and improve automatic modeling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. True vs. Fake content and multimedia layers</head><p>Our dataset by construction contains mostly untrue claims: even though PolitiFact occasionally fact-checks statements that turn out to be true, most of their materials are "false", "mostly false" or even "pants on fire". Moreover, even true claims often exhibit signs of user manipulation. In this appendix, we show statistics for fake vs. true content in PolitiFact reports (Table <ref type="table">4</ref>).</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04 -06, 2024, Pisa, Italy uryupina@gmail.com (O. Uryupina)1 PolitiFact (https://www.politifact.com/) is an independent journalistic agency and one of the most experienced fact-checking organizations, providing detailed analytics for non-transparent online content since 2007.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>( a )Figure 1 :</head><label>a1</label><figDesc>Figure 1: Different uses of layered/multimedia content</figDesc><graphic coords="2,112.36,302.51,170.08,144.11" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Types of layered content, per platform.</figDesc><table><row><cell>Layer</cell><cell cols="2">Facebook</cell><cell cols="2">Twitter</cell><cell cols="2">Instagram</cell><cell>TikTok</cell><cell></cell><cell cols="2">YouTube</cell><cell></cell><cell>Total</cell></row><row><cell>none</cell><cell cols="2">64 12.7%</cell><cell cols="2">80 41.9%</cell><cell>4</cell><cell>3.9%</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell cols="2">149 18.2%</cell></row><row><cell>video</cell><cell cols="2">195 38.6%</cell><cell cols="2">25 13.1%</cell><cell cols="2">40 38.9%</cell><cell cols="2">11 100%</cell><cell cols="2">6 100%</cell><cell cols="2">277 33.9%</cell></row><row><cell>photo</cell><cell cols="2">92 18.8%</cell><cell cols="2">31 16.2%</cell><cell>10</cell><cell>9.7%</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell cols="2">133 16.3%</cell></row><row><cell>screenshot</cell><cell cols="2">114 22.5%</cell><cell>19</cell><cell>9.9%</cell><cell cols="2">45 43.7%</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell cols="2">178 21.8%</cell></row><row><cell>link</cell><cell>29</cell><cell>5.7%</cell><cell>15</cell><cell>7.8%</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell></cell><cell>-</cell><cell>44</cell><cell>5.4%</cell></row><row><cell>image</cell><cell>14</cell><cell>2.8%</cell><cell cols="2">6 3.14%</cell><cell>6</cell><cell>5.8%</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>26</cell><cell>3.2%</cell></row><row><cell>thread</cell><cell>-</cell><cell>-</cell><cell>17</cell><cell>8.9%</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>-</cell><cell>17</cell><cell>2.1%</cell></row><row><cell>total</cell><cell>506</cell><cell>100%</cell><cell>191</cell><cell>100%</cell><cell>103</cell><cell>100%</cell><cell cols="2">11 100%</cell><cell cols="2">6 100%</cell><cell>818</cell><cell>100%</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Role of multimedia layers, per content type (photo+ includes photos and images, screenshot+ includes screenshots, links and threads/retweets), purely textual documents discarded.</figDesc><table><row><cell>role</cell><cell>video</cell><cell></cell><cell cols="2">photo+</cell><cell cols="2">screensh.+</cell></row><row><cell></cell><cell>total</cell><cell>%</cell><cell>total</cell><cell>%</cell><cell>total</cell><cell>%</cell></row><row><cell>content</cell><cell cols="2">66 23.8</cell><cell cols="2">19 12.0</cell><cell cols="2">114 48.1</cell></row><row><cell>anchor</cell><cell cols="2">62 22.4</cell><cell cols="2">46 29.1</cell><cell>16</cell><cell>6.8</cell></row><row><cell>proof</cell><cell cols="2">86 31.0</cell><cell cols="2">36 22.8</cell><cell cols="2">39 16.5</cell></row><row><cell>support</cell><cell>14</cell><cell>5.1</cell><cell>4</cell><cell>2.5</cell><cell>16</cell><cell>6.8</cell></row><row><cell>paraphr.</cell><cell cols="2">30 10.8</cell><cell>6</cell><cell>3.8</cell><cell>23</cell><cell>9.7</cell></row><row><cell>context</cell><cell>8</cell><cell>2.9</cell><cell>3</cell><cell>1.9</cell><cell>21</cell><cell>8.9</cell></row><row><cell>illustr.</cell><cell>1</cell><cell>0.4</cell><cell cols="2">55 34.8</cell><cell>9</cell><cell>3.8</cell></row><row><cell>action</cell><cell>3</cell><cell>1.1</cell><cell>1</cell><cell>0.6</cell><cell>14</cell><cell>5.9</cell></row><row><cell>other</cell><cell cols="2">28 10.1</cell><cell>-</cell><cell>-</cell><cell cols="2">2 0.84</cell></row><row><cell>total</cell><cell>277</cell><cell></cell><cell>158</cell><cell></cell><cell>237</cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Types of manipulative content for different multimedia layers.</figDesc><table><row><cell>Issue</cell><cell cols="2">video</cell><cell cols="2">photo+</cell><cell cols="2">screenshot+</cell></row><row><cell>falsehood</cell><cell cols="2">93 33.6%</cell><cell cols="2">16 10.12%</cell><cell>130</cell><cell>54.9%</cell></row><row><cell>crop</cell><cell>12</cell><cell>4.3%</cell><cell>-</cell><cell>-</cell><cell>1</cell><cell>0.4%</cell></row><row><cell>miscaption</cell><cell cols="2">60 21.7%</cell><cell>47</cell><cell>29.7%</cell><cell>15</cell><cell>6.3%</cell></row><row><cell>altered/fake</cell><cell>17</cell><cell>6.1%</cell><cell>15</cell><cell>9.5%</cell><cell>29</cell><cell>12.2%</cell></row><row><cell>misperception</cell><cell>7</cell><cell>2.5%</cell><cell>5</cell><cell>3.2%</cell><cell>-</cell><cell>-</cell></row><row><cell>noproof</cell><cell>27</cell><cell>9.7%</cell><cell>3</cell><cell>1.9%</cell><cell>5</cell><cell>2.1%</cell></row><row><cell>explain</cell><cell>26</cell><cell>9.4%</cell><cell>6</cell><cell>3.8%</cell><cell>12</cell><cell>5.1%</cell></row><row><cell>none</cell><cell>13</cell><cell>4.7%</cell><cell>58</cell><cell>36.7%</cell><cell>43</cell><cell>18.1%</cell></row><row><cell></cell><cell>277</cell><cell></cell><cell>158</cell><cell></cell><cell>237</cell><cell></cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>We thank the Autonomous Province of Trento for the financial support of our project via the AI@TN initiative.</p></div>
			</div>

			<div type="annex">
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 4</head><label>4</label><figDesc>Fake vs. true content per platform; only the header fragments ("FC label", "Facebook") survived extraction.</figDesc><table /></figure>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><surname>Anonymous</surname></persName>
		</author>
		<title level="m">PolyFake: Fine-grained multiperspective annotation of fact-checking reports</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>Accepted for publication</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Categorizing and inferring the relationship between the text and image of Twitter posts</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vempala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Preoţiuc-Pietro</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/P19-1272</idno>
		<ptr target="https://aclanthology.org/P19-1272" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</title>
				<meeting>the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics<address><addrLine>Florence, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="2830" to="2840" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">MM-claims: A dataset for multimodal claim detection in social media</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Cheema</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hakimov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sittar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Müller-Budack</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Otto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ewerth</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.findings-naacl.72</idno>
		<ptr target="https://aclanthology.org/2022.findings-naacl.72" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: NAACL 2022, Association for Computational Linguistics</title>
				<meeting><address><addrLine>Seattle, United States</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="962" to="979" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Twitter-COMMs: Detecting climate, COVID, and military multimodal misinformation</title>
		<author>
			<persName><forename type="first">G</forename><surname>Biamby</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rohrbach</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.naacl-main.110</idno>
		<ptr target="https://aclanthology.org/2022.naacl-main.110" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics<address><addrLine>Seattle, United States</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1530" to="1549" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Dataset for multimodal fake news detection and verification tasks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bondielli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dell'oglio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lenci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Marcelloni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Passaro</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.dib.2024.110440</idno>
		<ptr target="https://doi.org/10.1016/j.dib.2024.110440" />
	</analytic>
	<monogr>
		<title level="j">Data in Brief</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page">110440</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Mubashara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Michael</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zhijiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Oana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Elena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andreas</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.13507</idno>
	<title level="m">Multimodal automated fact-checking: A survey</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
