<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Cognitive Mirage: A Review of Hallucinations in Large Language Models ⋆</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hongbin</forename><surname>Ye</surname></persName>
							<email>yehongbin@zhejianglab.com</email>
							<affiliation key="aff0">
								<orgName type="department">Zhejiang Lab</orgName>
								<address>
									<addrLine>No. 1 Kechuang Avenue, Yuhang District</addrLine>
									<settlement>Hangzhou City, Zhejiang Province</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tong</forename><surname>Liu</surname></persName>
							<email>liutong@zhejianglab.com</email>
							<affiliation key="aff0">
								<orgName type="department">Zhejiang Lab</orgName>
								<address>
									<addrLine>No. 1 Kechuang Avenue, Yuhang District</addrLine>
									<settlement>Hangzhou City, Zhejiang Province</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Aijia</forename><surname>Zhang</surname></persName>
							<email>zhangaijia@zhejianglab.com</email>
							<affiliation key="aff0">
								<orgName type="department">Zhejiang Lab</orgName>
								<address>
									<addrLine>No. 1 Kechuang Avenue, Yuhang District</addrLine>
									<settlement>Hangzhou City, Zhejiang Province</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Wei</forename><surname>Hua</surname></persName>
							<email>huawei@zhejianglab.com</email>
							<affiliation key="aff0">
								<orgName type="department">Zhejiang Lab</orgName>
								<address>
									<addrLine>No. 1 Kechuang Avenue, Yuhang District</addrLine>
									<settlement>Hangzhou City, Zhejiang Province</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Weiqiang</forename><surname>Jia</surname></persName>
							<email>jiaweiqiang@zhejianglab.com</email>
							<affiliation key="aff0">
								<orgName type="department">Zhejiang Lab</orgName>
								<address>
									<addrLine>No. 1 Kechuang Avenue, Yuhang District</addrLine>
									<settlement>Hangzhou City, Zhejiang Province</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Cognitive Mirage: A Review of Hallucinations in Large Language Models ⋆</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">201125FA1AC698B6DDF803613F0888C0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Taxonomy of Hallucination</term>
					<term>Large Language Models</term>
					<term>Hallucination Detection</term>
					<term>Hallucination Correction</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>As large language models continue to develop in the field of AI, text generation systems are susceptible to a worrisome phenomenon known as hallucination. In this study, we summarize recent compelling insights into hallucinations in LLMs. We present a novel taxonomy of hallucinations from various text generation tasks, thus providing theoretical insights, detection methods and improvement approaches. Based on this, future research directions are proposed. Our contributions are threefold: (1) We provide a complete taxonomy for hallucinations appearing in text generation tasks; (2) We provide theoretical analyses of hallucinations in LLMs and provide existing detection and improvement methods; (3) We propose several research directions that can be developed in the future. Our literature library is available at https://github.com/hongbinye/Cognitive-Mirage-Hallucinations-in-LLMs.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the ever-evolving realm of large language models (LLMs), a constellation of innovative creations has emerged, such as GPT-3 <ref type="bibr" target="#b0">[1]</ref>, InstructGPT <ref type="bibr" target="#b1">[2]</ref>, FLAN <ref type="bibr" target="#b2">[3]</ref>, PaLM <ref type="bibr" target="#b3">[4]</ref>, LLaMA <ref type="bibr" target="#b4">[5]</ref> and other notable contributors <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>. These models implicitly encode global knowledge within their parameters during the pre-training phase <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>, offering valuable insights as knowledge repositories for downstream tasks <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b13">14]</ref>. Nevertheless, the generalization of knowledge can result in memory distortion, an inherent limitation that may give rise to potential inaccuracies <ref type="bibr" target="#b14">[15]</ref>. Moreover, their ability to represent knowledge is constrained by model scale and faces challenges in addressing long-tailed knowledge problems <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17]</ref>. The privacy and timeliness of data in the real world <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19]</ref> further exacerbate this problem, making it difficult for models to maintain a comprehensive and up-to-date understanding of the facts. These challenges present a serious obstacle to the reliability of LLMs, a phenomenon we refer to as hallucination <ref type="bibr" target="#b19">[20]</ref>. 
A prominent example of this drawback is that models typically generate statements that appear reasonable but are either cognitively irrelevant or factually incorrect. In light of this observation, hallucinations remain a critical challenge in medical <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22]</ref>, financial <ref type="bibr" target="#b22">[23]</ref> and other knowledge-intensive fields due to the exacting accuracy requirements. Particularly, the applications for legal case drafting showcase plausible interpretation as an aggregation of diverse subjective perspectives <ref type="bibr" target="#b23">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition of Hallucination.</head><p>As depicted in Figure <ref type="figure" target="#fig_0">1</ref>, hallucination refers to the generation of texts or responses that exhibit grammatical correctness, fluency, and authenticity, but deviate from the provided source inputs (faithfulness) or do not align with factual accuracy (factualness) <ref type="bibr" target="#b24">[25]</ref>. In traditional NLP tasks <ref type="bibr" target="#b25">[26]</ref>, hallucinations are often synonymous with faithfulness: conflicting information leads to Intrinsic Hallucination, i.e., LMs conflict with the input information when generating a response; conversely, generating ambiguous supplementary information may lead to Extrinsic Hallucination, i.e., LMs produce personal names, historical events, or technical documents that are challenging to verify. LLMs-oriented hallucinations instead prioritize factualness, focusing on whether the result can be evidenced or negated by reference to external facts in the real world. Uncritical trust in LLMs can give rise to a phenomenon known as Cognitive Mirage, contributing to misguided decision-making and a cascade of unintended consequences <ref type="bibr" target="#b26">[27]</ref>.</p></div>
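The faithfulness/factualness distinction above can be made concrete with a toy check (a minimal sketch; the heuristic and all names are hypothetical, not a method from the surveyed works): content words a model introduces that the source cannot support are candidates for extrinsic hallucination and would require external fact verification.

```python
# Toy illustration of the intrinsic/extrinsic split (hypothetical heuristic):
# content words absent from the source input flag *extrinsic* candidates,
# which would then need checking against external world knowledge.
import re

STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "in", "on", "of",
             "to", "and", "for", "with", "by", "at", "it", "that", "this"}

def content_words(text: str) -> set[str]:
    """Lowercased alphabetic tokens minus a small stopword list."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def extrinsic_candidates(source: str, generated: str) -> set[str]:
    """Words the model introduced that the source cannot support."""
    return content_words(generated) - content_words(source)

source = "Marie Curie won the Nobel Prize in Physics in 1903."
generated = "Marie Curie won the Nobel Prize in Physics and later moved to Berlin."
print(sorted(extrinsic_candidates(source, generated)))  # ['berlin', 'later', 'moved']
```

Real detectors replace this word-overlap heuristic with entailment models or retrieval, as discussed in Section 4, but the asymmetry is the same: unsupported additions are extrinsic, contradictions of the source are intrinsic.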
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Present work</head><p>To effectively control the risk of hallucinations, we summarize recent progress in hallucination theories and solutions in this paper. We propose to organize relevant work by a comprehensive survey (Figure <ref type="figure">2</ref>):</p><p>• Theoretical insight and mechanism analysis. We provide in-depth theoretical and mechanism analysis from three typical perspectives: data collection, knowledge gap and optimization process, reviewing the recent and relevant theories related to hallucinations ( §2). • Taxonomy of hallucination in LLMs. We conduct a comprehensive review of hallucination in LLMs together with a task axis. We review the task-specific benchmarks with a comprehensive comparison and summary ( §3). • Wide coverage on emerging hallucination detection and correction methods. We propose a comprehensive investigation into the proactive detection ( §4) and mitigation ( §5) of hallucinations.</p><p>Data Collection. Hallucinations can arise from the training data itself, for example when handling language pairs with limited resources or non-English translations <ref type="bibr" target="#b38">[39]</ref>. Furthermore, cutting-edge Large Vision-Language Models (LVLMs) exhibit instances of hallucinating common objects within visual instructional datasets and are prone to hallucinating objects that frequently co-occur in the same image <ref type="bibr" target="#b39">[40,</ref><ref type="bibr" target="#b40">41]</ref>.</p><p>Knowledge Gap. Knowledge gaps are typically attributed to differences in input format between the pre-training and fine-tuning stages <ref type="bibr" target="#b41">[42]</ref>. Even when considering the automatic updating of textual knowledge bases, the output can deviate from the expected corrections <ref type="bibr" target="#b42">[43]</ref>. For example, questions often do not align effectively with stored knowledge, and the available information remains unknown until the questions are presented. 
This knowledge gap poses thorny challenges in balancing memory with retrieved evidence, which is construed as a passive defense mechanism against the misuse of retrieval <ref type="bibr" target="#b43">[44]</ref>. To delve into this issue, <ref type="bibr" target="#b44">[45]</ref> and <ref type="bibr" target="#b45">[46]</ref> propose that disregarding retrieved evidence introduces biased model knowledge, while mis-covering and over-thinking disrupt model behavior. Furthermore, in scenarios where a cache component is utilized to offer historical memory during training <ref type="bibr" target="#b46">[47]</ref>, the model also experiences inconsistency between the present hidden state and the hidden state stored in the cache.</p></div>
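The memory-versus-evidence trade-off described above can be sketched as a toy decision rule (the names and threshold are hypothetical; real systems weigh calibrated scores rather than a fixed cut-off):

```python
# Hypothetical sketch (not from the cited works) of balancing parametric
# memory against retrieved evidence: trust evidence only when the retriever
# is confident enough, otherwise fall back on the model's own answer.
from dataclasses import dataclass

@dataclass
class Candidate:
    answer: str
    score: float  # confidence in [0, 1]

def resolve(parametric: Candidate, retrieved: Candidate,
            evidence_threshold: float = 0.6) -> str:
    """Always ignoring evidence biases the model toward memorized (possibly
    stale) knowledge; always taking it risks the mis-covering failure mode."""
    if retrieved.score >= evidence_threshold:
        return retrieved.answer
    return parametric.answer

print(resolve(Candidate("2019", 0.9), Candidate("2022", 0.8)))  # strong evidence wins
print(resolve(Candidate("2019", 0.9), Candidate("2022", 0.3)))  # weak evidence ignored
```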
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Optimization Process</head><p>Maximum likelihood estimation and teacher-forcing training have the potential to result in a phenomenon known as stochastic parroting <ref type="bibr" target="#b47">[48]</ref>, wherein the model is prompted to imitate the training data without comprehension <ref type="bibr" target="#b48">[49]</ref>. Specifically, exposure bias between the training and testing stages has been demonstrated to lead to hallucinations within LLMs, particularly when generating lengthy responses <ref type="bibr" target="#b49">[50]</ref>. Besides, sampling techniques characterized by high uncertainty <ref type="bibr" target="#b50">[51]</ref>, such as top-p and top-k, exacerbate the issue of hallucination. Furthermore, <ref type="bibr" target="#b26">[27]</ref> observes that LLMs tend to produce snowballing hallucinations to maintain coherence with earlier hallucinations, and even when directed with prompts such as "Let's think step by step", they still generate ineffectual chains of reasoning <ref type="bibr" target="#b12">[13]</ref>.</p></div>
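To make the sampling discussion concrete, here is a minimal reference implementation of top-k and top-p (nucleus) filtering over a toy next-token distribution (illustrative only; production decoders operate on logits over full vocabularies). Larger k or p keep more of the low-probability tail, which is the high-uncertainty regime linked above to increased hallucination.

```python
# Minimal top-k and top-p (nucleus) filtering over a token->probability dict.
def top_k(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k most probable tokens and renormalize."""
    kept = dict(sorted(probs.items(), key=lambda kv: -kv[1])[:k])
    z = sum(kept.values())
    return {t: p / z for t, p in kept.items()}

def top_p(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the smallest prefix of the sorted distribution whose mass >= p."""
    kept, mass = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = pr
        mass += pr
        if mass >= p:
            break
    z = sum(kept.values())
    return {t: pr / z for t, pr in kept.items()}

probs = {"Paris": 0.70, "Lyon": 0.15, "Berlin": 0.10, "Mars": 0.05}
print(sorted(top_k(probs, 2)))      # ['Lyon', 'Paris']
print(sorted(top_p(probs, 0.80)))   # ['Lyon', 'Paris'] -- the tail is cut
```

With p = 0.99 the implausible tail token "Mars" survives filtering and can be sampled, which is precisely why high-uncertainty sampling settings exacerbate hallucination.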
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Taxonomy of Hallucination</head><p>In this paper, we mainly consider representative hallucinations, which are widely observed in various downstream tasks, i.e. Machine Translation, Question and Answer, Dialog System, Summarization System, Knowledge Graph with LLMs, and Visual Question Answer. As shown in Table <ref type="table" target="#tab_1">1</ref>, these hallucinations form a complex taxonomy across numerous mainstream tasks associated with LLMs. In the following sections, we will introduce representative types of hallucinations to be resolved.</p><p>• Machine Translation. Since perturbations (e.g., spelling or capitalization errors) can induce hallucinations reliably, traditional machine translation models tend to fall back on instances memorised by the model when subjected to perturbations <ref type="bibr" target="#b86">[87,</ref><ref type="bibr" target="#b87">88]</ref>. It is worth noting that hallucinations generated by LLMs are mainly off-target translations, over-generation, or failed translation attempts <ref type="bibr" target="#b38">[39]</ref>. In low-resource language settings, most models exhibit subpar performance due to the lack of annotated data <ref type="bibr" target="#b53">[54]</ref>; in contrast, they are vulnerable to an increased number of pre-trained languages in multilingual settings <ref type="bibr" target="#b88">[89]</ref>. Subsequently, families of LLMs trained on different scales of monolingual data have been shown to be a source of oscillatory hallucination pathology <ref type="bibr" target="#b38">[39]</ref>. • Question and Answer. Imperfect responses suffer from flawed external knowledge, knowledge recall cues and reasoning instruction <ref type="bibr" target="#b41">[42]</ref>. For example, LLMs are mostly unable to avoid answering when provided with no relevant information, instead providing incomplete yet plausible answers <ref type="bibr" target="#b55">[56]</ref>. 
In addition to external knowledge, memorized information without an accurate, reliable and accessible source also contributes to different types of hallucinations <ref type="bibr" target="#b21">[22]</ref>. Though scaling laws suggest that perplexity on the training distribution improves with parameter size, <ref type="bibr" target="#b29">[30]</ref> further discovers that scaling up models can increase the rate of imitative falsehoods.</p><p>• Dialog System. Some studies view dialogue models as unobtrusive imitators, which simulate the distributional properties of data instead of generating faithful output. For example, uncooperative responses <ref type="bibr" target="#b56">[57]</ref> originating from discourse phenomena incline the model to output an exact copy of the entire evidence. <ref type="bibr" target="#b57">[58]</ref> reports more nuanced hallucinations in KG-grounded dialogue systems as analyzed through human feedback. Similarly, FaithDial <ref type="bibr" target="#b58">[59]</ref>, BEGIN <ref type="bibr" target="#b59">[60]</ref> and MixCL <ref type="bibr" target="#b60">[61]</ref> all implement experiments on the WoW dataset to conduct a meta-evaluation of hallucination in knowledge-grounded dialogue.</p><p>• Summarization System. Automatically generated abstracts based on LLMs may be fluent, but they still typically lack faithfulness to the source document. Compared to the human evaluation of traditional summarization models <ref type="bibr" target="#b25">[26]</ref>, the summarizations generated by LLMs can be categorized into two major types: intrinsic hallucinations that distort the information present in the document; extrinsic hallucinations that provide additional information that cannot be directly attributed to the document <ref type="bibr" target="#b64">[65]</ref>. 
Note that extrinsic hallucination, as a measure of factually consistent continuation of inputs in LLMs, is given more attention in summarisation systems <ref type="bibr" target="#b61">[62,</ref><ref type="bibr" target="#b63">64]</ref>. Furthermore, <ref type="bibr" target="#b62">[63]</ref> subdivides extrinsic hallucinations into factual and non-factual hallucinations. The former provides additional world knowledge, which may benefit comprehensive understanding.</p><p>• Knowledge Graph with LLMs. Despite the promising progress in knowledge-based text generation, it encounters intrinsic hallucinations inherent to the process, where the generated text not only covers the input information but also incorporates redundant details derived from its internal memorized knowledge <ref type="bibr" target="#b89">[90]</ref>. To address this, <ref type="bibr" target="#b65">[66]</ref> establishes a distinction between correctly generated knowledge and knowledge hallucinations in terms of knowledge creation. Notably, Virtual Knowledge Extraction <ref type="bibr" target="#b90">[91]</ref> underscores the potential generalization capabilities of LLMs in the realms of constructing and inferring from knowledge graphs. <ref type="bibr" target="#b31">[32]</ref> further empowers LLMs to produce interpretable fact-checks through a neural symbolic approach. Based on their fidelity to the source, hallucinations are defined as subject hallucination, relation hallucination, and object hallucination.</p><p>• Cross-modal System. Augmented by the superior language capabilities of LLMs, the performance of cross-modal tasks achieves promising progress <ref type="bibr" target="#b91">[92,</ref><ref type="bibr" target="#b39">40]</ref>. 
However, despite replacing the original language encoder with LLMs, Large Visual Language Models (LVLMs) <ref type="bibr" target="#b92">[93]</ref> still generate object descriptions that are not present in the target image, denoted as object hallucinations <ref type="bibr" target="#b40">[41]</ref>. In particular, such failure cases are typically found in Visual Question Answering <ref type="bibr" target="#b40">[41,</ref><ref type="bibr" target="#b66">67]</ref>, Image Captioning <ref type="bibr" target="#b93">[94,</ref><ref type="bibr" target="#b94">95,</ref><ref type="bibr" target="#b95">96]</ref>, Report Generation <ref type="bibr" target="#b67">[68]</ref>, etc.</p></div>
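A toy version of the object-hallucination checks discussed above can be written in a few lines (a hypothetical helper, loosely in the spirit of CHAIR-style caption metrics rather than any cited system): any object term from a known vocabulary that the caption mentions but the image annotations lack is counted as hallucinated.

```python
# Toy object-hallucination check for image captioning (hypothetical helper):
# object terms mentioned in the caption but absent from the image's annotated
# object set are counted as hallucinated.
def hallucinated_objects(caption: str, annotated: set[str],
                         vocabulary: set[str]) -> set[str]:
    """Objects from a known vocabulary that the caption mentions
    but the image annotations do not contain."""
    mentioned = {w.strip(".,").lower() for w in caption.split()}
    return (mentioned & vocabulary) - annotated

vocab = {"dog", "cat", "frisbee", "car", "person"}
annotated = {"dog", "person"}
caption = "A person throws a frisbee to a dog near a car."
print(sorted(hallucinated_objects(caption, annotated, vocab)))  # ['car', 'frisbee']
```

Note how co-occurrence priors explain such errors: "frisbee" frequently appears alongside "dog" and "person" in training data, so the model mentions it even when the image lacks one.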
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Hallucination Detection</head><p>Conventional hallucination detection mainly depends on task-specific metrics, such as ROUGE and BLEU to evaluate the information overlap between source and target texts in summarization tasks <ref type="bibr" target="#b96">[97]</ref>, and knowledge F1 to estimate the knowledge-aware ability of response generation <ref type="bibr" target="#b97">[98]</ref>. These metrics focus on measuring faithfulness to references and fail to provide an assessment of factualness. Although some reference-free works have been proposed, plugin-based methods <ref type="bibr" target="#b98">[99]</ref> suffer from world knowledge limitations, QA-based matching metrics <ref type="bibr" target="#b99">[100]</ref> lack knowledge completeness of source information, and NLI-based methods <ref type="bibr" target="#b59">[60]</ref> are unable to support finer-grained hallucination checking as they are sentence-level; besides, entailment and hallucination are not equivalent problems. Given the paradigm shift in hallucination detection arising from the rapid development of LLMs, we present a novel taxonomy in Fig <ref type="figure" target="#fig_1">3</ref> and introduce each category in the following sections.</p><p>• Inference Classifier. The most straightforward strategy involves adopting classifiers to assess the likelihood of hallucinations. Concretely, given a question 𝒬 and an answer 𝒜, an inferential classifier 𝒞 can be asked to determine whether the answer contains hallucinatory content ℋ via computing 𝑝(ℋ) = ℱ𝒞(𝒬, 𝒜). Therefore, <ref type="bibr" target="#b63">[64]</ref> employs state-of-the-art LLMs for end-to-end generation of detection results. Other studies <ref type="bibr" target="#b30">[31]</ref> find that adding chains of thought indiscriminately may intervene in the final judgement, whereas retrieving knowledge properly results in gains. 
Furthering this concept, the hinted classifier and explainer <ref type="bibr" target="#b63">[64]</ref>, used to generate intermediate process labels and high-quality natural language explanations, are demonstrated to enhance the final predicted class from a variety of perspectives. Subsequently, <ref type="bibr" target="#b61">[62]</ref> suggests adopting a classifier model different from the generation model, contributing to easier judgement of factual consistency. For radiology report generation, binary classifiers <ref type="bibr" target="#b67">[68]</ref> can be leveraged to measure reliability by combining image and text embeddings. Unlike previous work that employs complex human-crafted rules to evaluate object hallucinations, GAVIE <ref type="bibr" target="#b66">[67]</ref> scores responses towards image content based on both accuracy and relevance criteria, evaluating the LMMs' output in an open-ended manner.</p><p>• Uncertainty Metric. It is important to examine the correlation between the hallucination metric and the quality of output from a variety of perspectives. One intuitive approach is to employ the probabilistic output of the model itself: ASTSN <ref type="bibr" target="#b74">[75]</ref> calculates the model's uncertainty about the identified concepts by utilising the logit output values. Similarly, BARTSCORE <ref type="bibr" target="#b69">[70]</ref> employs the universal notion that models trained to convert generated text to reference output or source text will score higher when the generated text is superior. It is an unsupervised metric that supports the addition of appropriate prompts to improve the measure design, without requiring human judgements for training. Furthermore, KoK <ref type="bibr" target="#b70">[71]</ref>, based on the work of <ref type="bibr" target="#b100">[101]</ref>, evaluates answer uncertainty from three categories, i.e., subjectivity, hedges and text uncertainty. 
Meanwhile, SLAG <ref type="bibr" target="#b71">[72]</ref> measures consistent factual beliefs in terms of paraphrase, logic, and entailment. In addition, KLD <ref type="bibr" target="#b72">[73]</ref> combines information theory-based metrics (e.g., entropy and KL-divergence) to capture knowledge uncertainty. Besides expert-stipulated programmatic supervision, POLAR <ref type="bibr" target="#b73">[74]</ref> introduces a Pareto optimal learning assessed risk score for estimating the confidence level of a response.</p><p>• Self-Evaluation. To self-evaluate is challenging since the model might be overconfident about its generated samples being correct. The motivating idea of SelfCheckGPT <ref type="bibr" target="#b76">[77]</ref> is to use the ability of the LLMs themselves to sample multiple responses and identify fictitious statements by measuring the consistency of information among responses. <ref type="bibr" target="#b75">[76]</ref> further illustrates that both the increase in size and the demonstration of assessment can improve self-assessment. Beyond repetitive multiple direct queries, <ref type="bibr" target="#b77">[78]</ref> uses open-ended indirect queries and compares their answers to each other for an agreed-upon score outcome. SelfCk <ref type="bibr" target="#b80">[81]</ref> imposes appropriate constraints on the same LLM to generate pairs of sentences triggering self-contradictions, which enables detection. In contrast, polling-based querying <ref type="bibr" target="#b40">[41]</ref> reduces the complexity of judgement by randomly sampling query objects. Besides, Self-Checker <ref type="bibr" target="#b78">[79]</ref> decomposes complex statements into multiple simple statements, fact-checking them one by one, while <ref type="bibr" target="#b79">[80]</ref> introduces two LLMs that drive the complex fact-checking reasoning process through cross-examination.</p><p>• Evidence Retrieval. 
Evidence retrieval accomplishes factual detection by retrieving supporting evidence related to hallucinations. To this end, designing a claim-centric pipeline allows a question-retrieve-summarize chain to effectively collect original evidence <ref type="bibr" target="#b83">[84,</ref><ref type="bibr" target="#b84">85]</ref>. Consequently, FActScore <ref type="bibr" target="#b82">[83]</ref> calculates the percentage of atomic facts supported by the given knowledge source. To adapt to tasks in which users interact with generative models, FacTool <ref type="bibr" target="#b85">[86]</ref> proposes to integrate a variety of tools into a task-agnostic and domain-agnostic detection framework, in order to assemble evidence about the authenticity of the generated content.</p></div>
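The self-evaluation idea described above can be sketched with a crude unigram-overlap proxy standing in for the cited works' actual NLI/QA scorers (the proxy is our simplification, not SelfCheckGPT's method): a claim the model cannot reproduce across stochastic samples is likely fabricated.

```python
# Sketch of consistency-based self-evaluation: score a claim by its average
# lexical agreement with independently sampled responses. Low agreement
# suggests the claim is not grounded in what the model reliably "knows".
def unigram_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def consistency(claim: str, samples: list[str]) -> float:
    """Average agreement of `claim` with stochastically sampled responses."""
    return sum(unigram_overlap(claim, s) for s in samples) / len(samples)

samples = ["turing was born in london in 1912",
           "alan turing was born in london",
           "turing was born in 1912 in london"]
supported = consistency("turing was born in london", samples)
fabricated = consistency("turing was born in paris in 1935", samples)
print(round(supported, 2), round(fabricated, 2))
assert supported > fabricated  # low consistency flags the fabricated claim
```

Replacing `unigram_overlap` with an entailment model or QA-based scorer recovers the stronger variants used in practice.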
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Hallucination Correction</head><p>In this section, we delve into methods to correct hallucination from different aspects. As shown in Figure <ref type="figure" target="#fig_2">4</ref>, these hallucination correction paradigms have proven effective in many mainstream NLP tasks. Note that these methods are not entirely orthogonal but can complement each other as required by the tasks in practical applications. In the following sections, we will introduce each method as shown in Figure <ref type="figure">5</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Hallucination Correction</head><p>Figure <ref type="figure">5</ref>: Taxonomy of Hallucination Correction.</p><p>• Parameter Adaptation. Parameters in LLMs store biases learned in pre-training, which are often unaligned with user intent. A cutting-edge strategy is to guide effective knowledge through parameter conditioning, editing, and optimisation. For example, CLR <ref type="bibr" target="#b60">[61]</ref> optimises to reduce the generation probability of negative samples at the span level utilising contrastive learning. 
When introducing contextual background knowledge that contradicts the model's intrinsic prior knowledge, TYE <ref type="bibr" target="#b104">[105]</ref> effectively reduces the weight of prior knowledge through a context-aware decoding method. Besides, PURR <ref type="bibr" target="#b103">[104]</ref> corrupts text with noise, fine-tunes compact editors, and denoises by merging relevant evidence. Introducing an additional cache component, HISTALIGN <ref type="bibr" target="#b46">[47]</ref> discovers that its hidden state is not aligned with the current hidden state, and proposes sequence-information contrastive learning to improve the reliability of memory parameters. Consequently, Edit-TA <ref type="bibr" target="#b101">[102]</ref> mitigates the biases learnt in pre-training from a task arithmetic perspective. The intuition behind it is that parameter variations learnt through negative example tasks can be perceived through weight variances. However, as this fails to take the importance of different negative examples into account, EWR <ref type="bibr" target="#b102">[103]</ref> proposes Fisher information matrices to measure the uncertainty of their estimation, which is applied in dialogue systems to execute parameter interpolation and remove hallucination. EasyEdit <ref type="bibr" target="#b108">[109]</ref> summarises methods for parameter editing while minimising the influence on irrelevant parameters. An efficient alternative is to identify task-specific parameters and exploit them. For example, ALLM <ref type="bibr" target="#b105">[106]</ref> aligns the parameter module with task-specific knowledge, and then generates the relevant knowledge as additional context in background-augmented prompts. 
Similarly, mmT5 <ref type="bibr" target="#b54">[55]</ref> utilises language-specific modules during pre-training to separate language-specific information from language-independent information, demonstrating that adding language-specific modules can alleviate the curse of multilinguality. In contrast, TRAC <ref type="bibr" target="#b106">[107]</ref> combines conformal prediction and global testing to augment retrieval-based QA. Its conservative strategy ensures that an answer semantically equivalent to the truthful answer is included in the prediction set.</p><p>Another parameter adaptation idea focuses on flexible sampling consistent with user requirements. For instance, <ref type="bibr" target="#b50">[51]</ref> observes that the randomness of sampling is more detrimental to factuality when generating the latter part of a sentence. The factual-nucleus sampling algorithm is introduced to preserve the faithfulness of the generation while maintaining quality and diversity. Besides, Inference-Time <ref type="bibr" target="#b107">[108]</ref> first identifies a set of attention heads with high linear-probing accuracy, and then shifts activations during inference along the direction associated with factual knowledge.</p><p>• Post-hoc Attribution and Edit Technology. One source of hallucination is that LLMs may apply patterns observed in the pre-training data to inference in a novel form. Recently, ORCA <ref type="bibr" target="#b111">[112]</ref> reveals problematic patterns in model behaviour by probing supporting evidence in the pre-training data. Similarly, TRAK <ref type="bibr" target="#b113">[114]</ref> and Data-Portraits <ref type="bibr" target="#b114">[115]</ref> analyse whether models plagiarise or reference existing resources by means of data attribution. 
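The membership-testing idea behind such data-attribution tools can be sketched as a hashed n-gram index over the corpus, queried with generated text to estimate verbatim overlap. The class and method names here are illustrative, not the actual Data-Portraits API:

```python
import hashlib

def ngrams(text, n):
    toks = text.split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

class DataPortrait:
    """Hashed n-gram index over a corpus: membership queries test whether
    generated text overlaps verbatim with the indexed (pre-)training data."""
    def __init__(self, corpus, n=3):
        self.n = n
        self.index = {hashlib.sha1(g.encode()).hexdigest()[:16]
                      for doc in corpus for g in ngrams(doc, n)}

    def overlap_ratio(self, text):
        grams = ngrams(text, self.n)
        if not grams:
            return 0.0
        hits = sum(hashlib.sha1(g.encode()).hexdigest()[:16] in self.index
                   for g in grams)
        return hits / len(grams)

portrait = DataPortrait(["the quick brown fox jumps over the lazy dog"], n=3)
assert portrait.overlap_ratio("the quick brown fox") == 1.0  # seen verbatim
assert portrait.overlap_ratio("a completely novel sentence") == 0.0
```

Hashing the n-grams keeps the index compact while still supporting exact-match membership queries over very large corpora.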
QUIP <ref type="bibr" target="#b117">[118]</ref> further demonstrates that providing text observed during the pre-training phase improves the ability of LLMs to generate factual information. Furthermore, motivated by the gap between LLM and human modes of thinking, one intuition is to align the two modes of reasoning. Thus, <ref type="bibr" target="#b13">[14]</ref> elicits faithful reasoning via Chain-of-Thought (CoT) <ref type="bibr" target="#b12">[13]</ref> prompting. Similarly, RR <ref type="bibr" target="#b112">[113]</ref> retrieves relevant external knowledge based on the decomposed reasoning steps obtained from a CoT prompt. Since LLMs often do not produce their best output on the first attempt, Self-Refine <ref type="bibr" target="#b115">[116]</ref> implements self-refinement through iterative feedback and improvement. Reflexion <ref type="bibr" target="#b116">[117]</ref> likewise employs verbal reinforcement to generate reflective feedback by learning from prior failings. Verify-and-Edit <ref type="bibr" target="#b118">[119]</ref> proposes a CoT-prompted verify-and-edit framework, which improves the fidelity of predictions by post-editing the reasoning chain based on externally retrieved knowledge. CoVe <ref type="bibr" target="#b119">[120]</ref> emphasises the importance of independent self-verification to avoid being influenced by other responses. Another source of hallucination is describing factual content with incorrect retrievals. To rectify this, NP-Hunter <ref type="bibr" target="#b110">[111]</ref> follows a generate-then-refine strategy whereby a generated response is amended using the KG, so that the dialogue system can correct potential hallucinations by querying the KG.</p><p>• Leverage External Knowledge. As an attempt to extend language models for hallucination mitigation, one suggestion is to retrieve relevant documents from large textual databases. 
RETRO <ref type="bibr" target="#b120">[121]</ref> splits the input sequence into chunks and retrieves similar documents, while In-Context RALM <ref type="bibr" target="#b61">[62]</ref> places the selected document before the input text to improve prediction. Furthermore, IRCoT <ref type="bibr" target="#b121">[122]</ref> interweaves CoT generation and document retrieval steps to guide LLMs. LLM-AUGMENTER <ref type="bibr" target="#b122">[123]</ref> also grounds the responses of LLMs in integrated external knowledge and automated feedback to improve the truthfulness of answers. Another work, CoK <ref type="bibr" target="#b125">[126]</ref>, iteratively anticipates the content of upcoming sentences and applies it as a query to retrieve relevant documents, re-generating sentences that contain low-confidence tokens. Similarly, RETA-LLM <ref type="bibr" target="#b128">[129]</ref> creates a complete pipeline to assist users in building their own domain-based LLM retrieval systems. Note that, in addition to document retrieval, diverse external knowledge sources could be assembled into retrieval-augmented LLM systems. For example, FLARE <ref type="bibr" target="#b126">[127]</ref> leverages structured knowledge bases to support complex queries and provide more straightforward factual statements. Further, KnowledGPT <ref type="bibr" target="#b129">[130]</ref> adopts program-of-thoughts (PoT) prompting, which generates code to interact with knowledge bases, while cTBL <ref type="bibr" target="#b124">[125]</ref> proposes enhancing LLMs with tabular data in conversational settings. Besides, GeneGPT <ref type="bibr" target="#b123">[124]</ref> demonstrates that domain expertise can be accessed more easily and accurately by detecting and executing API calls through in-context learning and augmented decoding algorithms. 
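The retrieve-then-prepend pattern shared by several of these systems can be sketched as follows; the lexical-overlap scorer is a toy stand-in for a real dense or sparse retriever:

```python
def score(query, doc):
    """Toy lexical relevance: word overlap between query and document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def retrieve(query, corpus, k=1):
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, corpus):
    """Prepend the top retrieved document(s) to the user query, so the
    model conditions on external evidence rather than parametric memory."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Daniel Vacek is a former tennis player who turned professional in 1990.",
    "The Czech Republic is a landlocked country in Central Europe.",
]
prompt = build_prompt("In which sport did Daniel Vacek turn professional?", corpus)
assert "tennis" in prompt
```

In practice the generation step would pass `prompt` to an LLM; the key point is that the evidence is placed before the question, as in In-Context RALM.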
To support potentially millions of ever-changing APIs, Gorilla <ref type="bibr" target="#b127">[128]</ref> explores self-instruct fine-tuning and retrieval for efficient API exploitation.</p><p>• Assessment Feedback. As language models become more sophisticated, evaluation feedback can significantly improve the quality of generated text and reduce the occurrence of hallucinations. To realise this concept, LSHF <ref type="bibr" target="#b130">[131]</ref>, TLM <ref type="bibr" target="#b131">[132]</ref> and Chain-of-Hindsight <ref type="bibr" target="#b133">[134]</ref> predict human preferences through reinforcement learning and employ them as the reward function. In addition to letting the model learn directly from the feedback of factuality metrics in a sample-efficient manner <ref type="bibr" target="#b137">[138]</ref>, it is also important to build a self-evaluation function into the model to filter candidate generations. For example, BRIO <ref type="bibr" target="#b132">[133]</ref> empowers summarisation model assessment, estimating probability distributions over candidate outputs to rate the quality of candidate summaries, while LM-know <ref type="bibr" target="#b75">[76]</ref> investigates whether LLMs can evaluate the validity of their own claims by estimating the probability that they know the answer to a question. Do-LLM-Know <ref type="bibr" target="#b77">[78]</ref> queries black-box LLMs exclusively, generating answers to the same query multiple times and comparing the results with each other as a consistency check. Since missing citation-quality evaluation affects final performance, ALCE <ref type="bibr" target="#b136">[137]</ref> employs a natural language inference model to measure citation quality and extends the integrated retrieval system. 
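The repeated-query consistency check just described can be sketched as sampling the same question several times and measuring agreement; `sample_fn` is a hypothetical stand-in for one stochastic black-box model call:

```python
from collections import Counter
import itertools

def consistency_check(sample_fn, query, n=5, threshold=0.6):
    """Query a black-box model n times and measure agreement among the
    sampled answers; low agreement flags a likely hallucination."""
    answers = [sample_fn(query) for _ in range(n)]
    top, count = Counter(answers).most_common(1)[0]
    agreement = count / n
    return top, agreement, agreement >= threshold

# Simulated model: replies cycle deterministically for illustration.
replies = itertools.cycle(["tennis", "tennis", "tennis", "cricket", "tennis"])
answer, agreement, consistent = consistency_check(lambda q: next(replies),
                                                  "Vacek's sport?")
assert answer == "tennis" and consistent  # 4/5 agreement passes the check
```

The threshold trades off precision against recall of the hallucination flag; a real system would also normalise answers before comparing them.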
Similarly, CRITIC <ref type="bibr" target="#b134">[135]</ref> proposes interacting with appropriate tools to assess certain aspects of the text, and then modifying the output based on the feedback obtained during verification. Note that automated error checking can also use LLMs to generate text that conforms to tool interfaces. PaD <ref type="bibr" target="#b135">[136]</ref> distils LLMs with synthetic reasoning programs, which can be automatically compiled and executed by an interpreter. Further, iterative refinement processes have been validated to effectively identify appropriate details <ref type="bibr" target="#b95">[96,</ref><ref type="bibr" target="#b44">45,</ref><ref type="bibr" target="#b14">15]</ref>, and can stop invalid reasoning chains early, beneficially reducing the phenomenon of hallucination snowballing <ref type="bibr" target="#b26">[27]</ref>.</p><p>• Mindset Society. Human intelligence thrives on cognitive synergy, where collaboration between different cognitive processes produces better results than isolated individual processes. The "society of minds" <ref type="bibr" target="#b144">[145]</ref> is believed to have the potential to significantly improve the performance of LLMs and pave the way for consistency in language production and comprehension. To address hallucinations of large-scale multilingual models across different translation scenarios, HLMTM <ref type="bibr" target="#b38">[39]</ref> proposes a hybrid setting in which other translation systems can be requested to act as back-ups when the original system is hallucinating. Multiagent-Debate <ref type="bibr" target="#b139">[140]</ref> employs multiple LLMs over several rounds to propose and debate their individual responses and reasoning processes in order to reach a consensus final answer. 
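Such a debate loop can be sketched as below, with toy callables standing in for LLM agents; each agent revises its answer after seeing its peers' previous answers, and the majority position is returned:

```python
from collections import Counter

def debate(agents, question, rounds=2):
    """Each agent answers, then revises after seeing the other agents'
    answers; the final answer is the majority position. `agents` are
    stand-ins: callables (question, peer_answers) -> answer."""
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        answers = [agent(question, [a for j, a in enumerate(answers) if j != i])
                   for i, agent in enumerate(agents)]
    return Counter(answers).most_common(1)[0][0]

# Toy agents: one confident correct agent, two that initially err but
# defer to the first answer they observe among their peers.
def confident(question, peers):
    return "tennis"

def conformist(question, peers):
    return Counter(peers).most_common(1)[0][0] if peers else "cricket"

winner = debate([confident, conformist, conformist], "Vacek's sport?")
assert winner == "tennis"
```

With real LLM agents, `peer_answers` would be injected into the next-round prompt, and the judge or majority vote applied to the final round's outputs.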
Through this process, the models are encouraged to construct answers that are consistent both with their internal criticism and with the responses of other agents. Before a final answer is presented, the resulting community of models can hold multiple reasoning chains and possible answers simultaneously. Building on this idea, MAD <ref type="bibr" target="#b140">[141]</ref> adds a judge-managed debate process, demonstrating that adaptive interruption of debates and a controlled "tit-for-tat" state help to complete factual debates. Furthermore, FORD <ref type="bibr" target="#b141">[142]</ref> proposes roundtable debates that include more than two LLMs and emphasises that competent judges are essential to steer the debate. LM-vs-LM <ref type="bibr" target="#b79">[80]</ref> also proposes multi-round interactions in which one LM cross-examines another to check the factuality of the original statements. Besides, PRD <ref type="bibr" target="#b142">[143]</ref> proposes a peer-rank-and-discussion-based evaluation framework to arrive at an assessment result that all peers agree with. To maintain strong reasoning, SPP <ref type="bibr" target="#b143">[144]</ref> utilises LLMs to assign several fine-grained roles, which effectively stimulates knowledge acquisition and reduces hallucinations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Future Directions</head><p>Though numerous technical solutions for hallucinations in LLMs have been surveyed above, several potential directions remain:</p><p>• Data Construction Management. As previously discussed, the style and knowledge of LLMs are largely learned during pre-training. High-quality data present promising opportunities for reducing hallucinations in LLMs <ref type="bibr" target="#b145">[146]</ref>. Inspired by the basic rule of machine learning, "garbage in, garbage out", <ref type="bibr" target="#b146">[147]</ref> proposes that data quality and diversity matter more than large-scale instruction fine-tuning <ref type="bibr" target="#b147">[148,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b148">149]</ref> and RLHF <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b1">2]</ref>. To perform efficiently in knowledge-intensive verticals, we argue that constructing entity-centred fine-tuning instructions <ref type="bibr" target="#b149">[150,</ref><ref type="bibr" target="#b150">151,</ref><ref type="bibr" target="#b151">152]</ref> is a promising direction, as it can enhance the factuality of generated entity information. Another feasible proposal is to incorporate a self-curation phase <ref type="bibr" target="#b152">[153]</ref> into the instruction construction process to rate the quality of candidate pairs. During iteration, quality evaluation <ref type="bibr" target="#b153">[154]</ref> based on manual or automated rule constraints could provide self-correction capacity.</p><p>• Reasoning Mechanism Exploitation. The emerging CoT technique <ref type="bibr" target="#b13">[14]</ref> stimulates the emergent reasoning ability of LLMs by imitating the intrinsic stream of thought. 
A primary improvement, ToT <ref type="bibr" target="#b154">[155]</ref>, introduces tree structure into the thought process and provides a novel backtracking capability. However, actual thinking forms a complex network of ideas; for example, a person may explore a particular chain of reasoning, backtrack, or start a new chain. GoT <ref type="bibr" target="#b155">[156]</ref> extends the dependencies between thoughts by constructing vertices with multiple incoming edges to aggregate arbitrary thoughts. Since previous methods lack storage for intermediate results, CR <ref type="bibr" target="#b155">[156]</ref> works in a cumulative, iterative manner to simulate human thought processes and decomposes the task into smaller components. In addition to such self-heuristic methods, PAL <ref type="bibr" target="#b156">[157]</ref> and PoT <ref type="bibr" target="#b157">[158]</ref> introduce programming logic into the language space <ref type="bibr" target="#b158">[159]</ref>, adding the ability to invoke external interpreters. In summary, research grounded in human cognition helps to provide insight into the analysis of hallucinations, for example Dual Process Theory <ref type="bibr" target="#b159">[160]</ref>, the three-layer mental model <ref type="bibr" target="#b160">[161]</ref>, the Computational Theory of Mind <ref type="bibr" target="#b161">[162]</ref>, and Connectionism <ref type="bibr" target="#b162">[163]</ref>.</p><p>• Multi-modal Hallucination Survey. It has become a community consensus to build powerful Multimodal Large Language Models (MLLMs) <ref type="bibr" target="#b163">[164,</ref><ref type="bibr" target="#b164">165,</ref><ref type="bibr" target="#b165">166]</ref> by taking advantage of the excellent comprehension and reasoning capabilities of LLMs. <ref type="bibr" target="#b40">[41]</ref> confirms the severity of hallucination in MLLMs through object detection and polling-based querying. 
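Polling-based querying of this kind can be sketched as balanced yes/no probes over objects that are and are not present in the image; `ask_model` below is a hypothetical stand-in for one MLLM call:

```python
def polling_queries(present_objects, absent_objects):
    """Build balanced yes/no probes: the model should answer 'yes' for
    objects detected in the image and 'no' for sampled absent objects."""
    probes = [(f"Is there a {o} in the image?", "yes") for o in present_objects]
    probes += [(f"Is there a {o} in the image?", "no") for o in absent_objects]
    return probes

def hallucination_rate(ask_model, probes):
    """Fraction of absent-object probes the model wrongly affirms."""
    wrong = sum(1 for q, gold in probes
                if gold == "no" and ask_model(q) == "yes")
    total = sum(1 for _, gold in probes if gold == "no")
    return wrong / total if total else 0.0

probes = polling_queries(["dog", "frisbee"], ["car", "boat"])
# A model that affirms everything hallucinates every absent object.
assert hallucination_rate(lambda q: "yes", probes) == 1.0
```

Sampling the absent objects from frequently co-occurring categories makes the probe set adversarial rather than random.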
The results indicate that MLLMs are highly susceptible to object hallucination: the generated description often does not match the target image. Besides, <ref type="bibr" target="#b166">[167]</ref> finds that MLLMs have limited multimodal reasoning ability and depend on spurious cues. Though a current study <ref type="bibr" target="#b167">[168]</ref> provides a broad overview of MLLMs, the causes of hallucination have not been comprehensively investigated. In the future, as more sophisticated multi-modal applications emerge, in-depth analysis of the biased distributions resulting from misalignment among modalities is a promising research direction for providing faithful modal interactions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion and Vision</head><p>In this paper, we provide an overview of hallucination in LLMs, covering a new taxonomy, theoretical insights, detection methods, correction methods, and several future research directions. Note that it is crucial to utilise LLMs in a responsible and beneficial manner. Furthermore, with sophisticated and efficient detection methods proposed for various aspects, LLMs will provide humans with reliable and secure information in broad application scenarios.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Illustration of Hallucination in LLMs. While the initial response appears fluent, it fails to align with the world knowledge retrieved from the external knowledge base.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Taxonomy of Hallucination Detection.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Sankey diagram of hallucination correction methods with different mainstream NLP tasks.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>In which sport did the Czech stars Daniel Vacek and Hana Mandlíková gain professional status? Daniel Vacek and Hana Mandlíková both gained professional status in cricket. ◆Daniel Vacek (born 1 April 1971) is a former tennis player from Czechoslovakia and the Czech Republic who turned professional in 1990.</figDesc><table><row><cell>◆Hana Mandlíková (born 19 February 1962) is a</cell></row><row><cell>former professional tennis player from</cell></row><row><cell>Czechoslovakia who later obtained Australian</cell></row><row><cell>citizenship.</cell></row><row><cell>Daniel Vacek and Hana Mandlíková both gained</cell></row><row><cell>professional status in tennis.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>List of Representative Hallucination</figDesc><table><row><cell>Paper</cell><cell>Task</cell><cell cols="3">Architecture Resources</cell><cell cols="2">Hallucination Types</cell><cell cols="2">Research Method</cell></row><row><cell>Raunak et al. [52]</cell><cell>Machine Transla-</cell><cell>Enc-Dec</cell><cell cols="2">IWSLT-2014</cell><cell cols="2">Under perturbation, Natural hal-</cell><cell cols="2">Source perturbation</cell></row><row><cell></cell><cell>tion</cell><cell></cell><cell></cell><cell></cell><cell>lucination</cell><cell></cell><cell></cell></row><row><cell cols="2">Guerreiro et al. [53] Machine Transla-</cell><cell>Enc-Dec</cell><cell>WMT2018</cell><cell></cell><cell cols="2">Oscillatory hallucination, Largely</cell><cell cols="2">Consider a natural sce-</cell></row><row><cell></cell><cell>tion</cell><cell></cell><cell></cell><cell></cell><cell cols="2">fluent hallucination</cell><cell>nario</cell></row><row><cell>Dale et al. [54]</cell><cell>Machine Transla-</cell><cell>Enc-Dec</cell><cell cols="2">FLORES-200, Jig-</cell><cell cols="2">Full hallucination, Partial halluci-</cell><cell cols="2">Introduce pathology</cell></row><row><cell></cell><cell>tion</cell><cell></cell><cell cols="2">saw, Wikipedia</cell><cell cols="2">nation, Word-level hallucination</cell><cell>detection</cell></row><row><cell>Pfeiffer et al. [55]</cell><cell>Multilingual</cell><cell>Enc-Dec</cell><cell cols="2">XQuAD, TyDi,</cell><cell cols="2">Source language hallucination</cell><cell cols="2">Evaluate source lan-</cell></row><row><cell></cell><cell>Seq2seq</cell><cell></cell><cell cols="2">XNLI, XL-Sum,</cell><cell></cell><cell></cell><cell cols="2">guage hallucination</cell></row><row><cell></cell><cell></cell><cell></cell><cell>MASSIVE</cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>Lin et al. 
[30]</cell><cell>Question and An-</cell><cell>Enc-Dec,</cell><cell cols="2">TruthfulQA</cell><cell cols="2">Imitative falsehoods</cell><cell cols="2">Cause imitative false-</cell></row><row><cell></cell><cell>swer</cell><cell>Only-Dec</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>hoods</cell></row><row><cell>Zheng et al. [42]</cell><cell>Question and An-</cell><cell>Only-Dec</cell><cell cols="2">HotpotQA,</cell><cell>Comprehension,</cell><cell>Factualness,</cell><cell cols="2">Manual analysis of re-</cell></row><row><cell></cell><cell>swer</cell><cell></cell><cell>BoolQ</cell><cell></cell><cell cols="2">Specificity, Inference Hallucina-</cell><cell>sponses</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>tion</cell><cell></cell><cell></cell></row><row><cell>Adlakha et al. [56]</cell><cell>Question and An-</cell><cell>Enc-Dec,</cell><cell cols="2">NQ, HotpotQA,</cell><cell cols="2">Semantic equivalence, Symbolic</cell><cell cols="2">Evaluate retrieval aug-</cell></row><row><cell></cell><cell>swer</cell><cell>Only-Dec</cell><cell cols="2">TopiOCQA</cell><cell cols="2">equivalence, Intrinsic ambiguity,</cell><cell cols="2">mented QA</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">Granularity discrepancies, Incom-</cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">plete, Enumeration, Satisfactory</cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>Subset</cell><cell></cell><cell></cell></row><row><cell cols="2">Umapathi et al. 
[22] Question and An-</cell><cell>Only-Dec</cell><cell cols="2">MEDMCQA,</cell><cell>Reasoning</cell><cell>hallucination,</cell><cell cols="2">Medical benchmark</cell></row><row><cell></cell><cell>swer</cell><cell></cell><cell>Headqa,</cell><cell>US-</cell><cell cols="2">Memory-based hallucination</cell><cell cols="2">Med-HALT</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">MILE, Medqa,</cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell>Pubmed</cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>Dziri et al. [57]</cell><cell>Dialog System</cell><cell>Enc-Dec,</cell><cell>WoW,</cell><cell>CMU-</cell><cell cols="2">Hallucination, Partial hallucina-</cell><cell>Infer</cell><cell>exclusively</cell></row><row><cell></cell><cell></cell><cell>Only-Dec</cell><cell>DOG,</cell><cell>Topi-</cell><cell cols="2">tion, Generic, Uncooperative</cell><cell cols="2">from the knowledge-</cell></row><row><cell></cell><cell></cell><cell></cell><cell>calChat</cell><cell></cell><cell></cell><cell></cell><cell>snippet</cell></row><row><cell>Das et al. [58]</cell><cell>Dialog System</cell><cell>Only-Dec</cell><cell cols="2">OpenDialKG</cell><cell cols="2">Extrinsic-Soft/Hard/ Grouped,</cell><cell cols="2">Analyze entity-level</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">Intrinsic-Soft/ Hard/Repetitive,</cell><cell cols="2">fact hallucination</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>History Corrupted</cell><cell></cell><cell></cell></row><row><cell>Dziri et al. 
[59]</cell><cell>Dialog System</cell><cell>Enc-Dec,</cell><cell>WoW</cell><cell></cell><cell cols="2">Hallucination, Generic, Uncooper-</cell><cell cols="2">Hallucination-free</cell></row><row><cell></cell><cell></cell><cell>Only-Dec</cell><cell></cell><cell></cell><cell>ativeness</cell><cell></cell><cell cols="2">benchmark FaithDial</cell></row><row><cell>Dziri et al. [60]</cell><cell>Dialog System</cell><cell>Enc-Dec,</cell><cell>WoW,</cell><cell>CMU-</cell><cell cols="2">Fully attributable, Not at-</cell><cell cols="2">Knowledge-grounded</cell></row><row><cell></cell><cell></cell><cell>Only-Enc,</cell><cell>DOG,</cell><cell>Topi-</cell><cell>tributable, Generic</cell><cell></cell><cell cols="2">interaction</cell><cell>bench-</cell></row><row><cell></cell><cell></cell><cell>Only-Dec</cell><cell>calChat</cell><cell></cell><cell></cell><cell></cell><cell cols="2">mark Begin</cell></row><row><cell>Sun et al. [61]</cell><cell>Dialog System</cell><cell>Enc-Dec,</cell><cell>WoW</cell><cell></cell><cell cols="2">Intrinsic hallucination, Extrinsic</cell><cell cols="2">Sample responses for</cell></row><row><cell></cell><cell></cell><cell>Only-Dec</cell><cell></cell><cell></cell><cell>hallucination</cell><cell></cell><cell cols="2">conversation</cell></row><row><cell>Tam et al. [62]</cell><cell>Summarization</cell><cell>Enc-Dec,</cell><cell cols="6">CNN/DM, XSum Factually inconsistent summaries Generate summaries</cell></row><row><cell></cell><cell>System</cell><cell>Only-Dec</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">from given models</cell></row><row><cell>Cao et al. 
[63]</cell><cell>Summarization</cell><cell>Enc-Dec,</cell><cell>MENT</cell><cell></cell><cell cols="2">Non-hallucinated, Factual halluci-</cell><cell cols="2">Label factual entities</cell></row><row><cell></cell><cell>System</cell><cell>Only-Dec</cell><cell></cell><cell></cell><cell cols="2">nation, Non-factual hallucination,</cell><cell cols="2">from summarizations</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">Intrinsic hallucination</cell><cell></cell></row><row><cell>Shen et al. [64]</cell><cell>Summarization</cell><cell>Enc-Dec,</cell><cell>NHNet</cell><cell></cell><cell cols="2">News headline hallucination</cell><cell cols="2">Majority vote of jour-</cell></row><row><cell></cell><cell>System</cell><cell>Only-Enc</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">nalism degree holders</cell></row><row><cell>Qiu et al. [65]</cell><cell>Summarization</cell><cell>Multiple</cell><cell>XL-Sum</cell><cell></cell><cell cols="2">Intrinsic hallucination, Extrinsic</cell><cell cols="2">In a cross-lingual</cell></row><row><cell></cell><cell>System</cell><cell>ADapters</cell><cell></cell><cell></cell><cell>hallucination</cell><cell></cell><cell cols="2">transfer setting</cell></row><row><cell>Yu et al. 
[66]</cell><cell>Knowledge-based</cell><cell>Enc-Dec,</cell><cell cols="2">Encyclopedic,</cell><cell cols="2">Knowledge hallucination</cell><cell cols="2">Evaluate knowledge</cell></row><row><cell></cell><cell>text generation</cell><cell>Only-Dec</cell><cell>ETC</cell><cell></cell><cell></cell><cell></cell><cell cols="2">creating ability given</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">known facts</cell></row><row><cell>Mihindukulasooriya</cell><cell>Knowledge graph</cell><cell>Only-Dec</cell><cell>TekGen,</cell><cell></cell><cell cols="2">Subject hallucination, relation hal-</cell><cell>Ontology</cell><cell>driven</cell></row><row><cell>et al. [32]</cell><cell>generation</cell><cell></cell><cell>WebNLG</cell><cell></cell><cell cols="2">lucination, object hallucination</cell><cell>KGC</cell><cell>benchmark</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">Text2KGBench</cell></row><row><cell>Li et al. [41]</cell><cell>Visual Question</cell><cell>Enc-Dec</cell><cell>MSCOCO</cell><cell></cell><cell cols="2">Object hallucination</cell><cell cols="2">Caption hallucination</cell></row><row><cell></cell><cell>Answer</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="2">assessment</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">B</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ryder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Subbiah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhariwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neelakantan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shyam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sastry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Herbert-Voss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Krueger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Henighan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Ziegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Winter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hesse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sigler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Litwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Berner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mccandlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html" />
	</analytic>
	<monogr>
		<title level="m">NeurIPS 2020</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Larochelle</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Ranzato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Hadsell</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Balcan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Lin</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Training language models to follow instructions with human feedback</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ouyang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Almeida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Wainwright</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mishkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Slama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schulman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Kelton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Simens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Welinder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">F</forename><surname>Christiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leike</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lowe</surname></persName>
		</author>
		<ptr target="http://papers.nips.cc/paper_files/paper/2022/hash/b1efde53be364a73914f58805a001731-Abstract-Conference.html" />
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>NeurIPS</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Finetuned language models are zero-shot learners</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bosma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">Y</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Guu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=gEZrGCozdqR" />
	</analytic>
	<monogr>
		<title level="m">The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event</title>
				<meeting><address><addrLine>OpenReview</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">April 25-29, 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Chowdhery</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Narang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bosma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mishra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roberts</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Barham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">W</forename><surname>Chung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sutton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gehrmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Schuh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tsvyashchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Maynez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Barnes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Prabhakaran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Reif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hutchinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pope</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bradbury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Austin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Isard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gur-Ari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Duke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Levskaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghemawat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Michalewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Misra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Robinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fedus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ippolito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Luan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zoph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Spiridonov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sepassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dohan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Omernick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">S</forename><surname>Pillai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pellat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lewkowycz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Moreira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Polozov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Saeta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Diaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Firat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Catasta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Meier-Hellstern</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Eck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Petrov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Fiedel</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2204.02311</idno>
		<idno type="arXiv">arXiv:2204.02311</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2204.02311" />
		<title level="m">PaLM: Scaling language modeling with pathways</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Touvron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lavril</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Izacard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Martinet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lachaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lacroix</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Rozière</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hambro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Azhar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lample</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2302.13971</idno>
		<idno type="arXiv">arXiv:2302.13971</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2302.13971" />
		<title level="m">LLaMA: Open and efficient foundation language models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Training a helpful and harmless assistant with reinforcement learning from human feedback</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ndousse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Dassarma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Drain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Fort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ganguli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Henighan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Joseph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kadavath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kernion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Conerly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Showk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Elhage</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Hatfield-Dodds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hume</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Johnston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kravec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Lovitt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Nanda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Olsson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">B</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mccandlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Olah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kaplan</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2204.05862</idno>
		<idno type="arXiv">arXiv:2204.05862</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2204.05862" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Roller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Artetxe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dewan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Diab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">V</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mihaylov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shleifer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Shuster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Simig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Koura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sridhar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2205.01068</idno>
		<idno type="arXiv">arXiv:2205.01068</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2205.01068" />
		<title level="m">OPT: Open pre-trained transformer language models</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">GLM-130B: an open bilingual pre-trained model</title>
		<author>
			<persName><forename type="first">A</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">L</forename><surname>Tam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<ptr target="https://openreview.net/pdf?id=-Aw0rrrPUF" />
	</analytic>
	<monogr>
		<title level="m">The Eleventh International Conference on Learning Representations, ICLR 2023</title>
				<meeting><address><addrLine>Kigali, Rwanda</addrLine></address></meeting>
		<imprint>
			<publisher>OpenReview</publisher>
			<date type="published" when="2023">May 1-5, 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">WizardLM: Empowering large language models to follow complex instructions</title>
		<author>
			<persName><forename type="first">C</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Geng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Tao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jiang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.12244</idno>
		<idno type="arXiv">arXiv:2304.12244</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.12244" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Pre-trained models: Past, present and future</title>
		<author>
			<persName><forename type="first">X</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.aiopen.2021.08.002</idno>
		<ptr target="https://doi.org/10.1016/j.aiopen.2021.08.002" />
	</analytic>
	<monogr>
		<title level="j">AI Open</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="225" to="250" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Towards reasoning in large language models: A survey</title>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">C</forename><surname>Chang</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.findings-acl.67</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.findings-acl.67" />
	</analytic>
	<monogr>
		<title level="m">Findings of ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1049" to="1065" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">ChatGPT vs human-authored text: Insights into controllable text summarization and sentence style transfer</title>
		<author>
			<persName><forename type="first">D</forename><surname>Pu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Demberg</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-srw.1</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-srw.1" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">V</forename><surname>Padmakumar</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Vallejo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Fu</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1" to="18" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Large language models are zero-shot reasoners</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kojima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Reid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Matsuo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Iwasawa</surname></persName>
		</author>
		<ptr target="http://papers.nips.cc/paper_files/paper/2022/hash/8bb0d291acd4acf06ef112099c16f326-Abstract-Conference.html" />
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>NeurIPS</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Chain-of-thought prompting elicits reasoning in large language models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schuurmans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bosma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ichter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">H</forename><surname>Chi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<ptr target="http://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html" />
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>NeurIPS</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Improving language models via plug-and-play retrieval feedback</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sabharwal</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14002</idno>
		<idno type="arXiv">arXiv:2305.14002</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14002" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Large language models struggle to learn long-tail knowledge</title>
		<author>
			<persName><forename type="first">N</forename><surname>Kandpal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roberts</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Wallace</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Raffel</surname></persName>
		</author>
		<ptr target="https://proceedings.mlr.press/v202/kandpal23a.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Machine Learning Research</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Brunskill</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Engelhardt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sabato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Scarlett</surname></persName>
		</editor>
		<meeting>Machine Learning Research<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">202</biblScope>
			<biblScope unit="page" from="15696" to="15707" />
		</imprint>
	</monogr>
	<note>ICML 2023</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">When not to trust language models: Investigating effectiveness of parametric and non-parametric memories</title>
		<author>
			<persName><forename type="first">A</forename><surname>Mallen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Asai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Das</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Khashabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajishirzi</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-long.546</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-long.546" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="9802" to="9822" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Internet-augmented language models through few-shot prompting for open-domain question answering</title>
		<author>
			<persName><forename type="first">A</forename><surname>Lazaridou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Gribovskaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Stokowiec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Grigorev</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2203.05115</idno>
		<idno type="arXiv">arXiv:2203.05115</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2203.05115" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">REPLUG: retrieval-augmented black-box language models</title>
		<author>
			<persName><forename type="first">W</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Min</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yasunaga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Seo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>James</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yih</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2301.12652</idno>
		<idno type="arXiv">arXiv:2301.12652</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2301.12652" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A survey of knowledge-enhanced text generation</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jiang</surname></persName>
		</author>
		<idno type="DOI">10.1145/3512467</idno>
		<ptr target="https://doi.org/10.1145/3512467" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page">38</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Dash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Thapa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Banda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Swaminathan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cheatham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kashyap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kotecha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gombar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Downing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pedreira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Goh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Arnaout</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Morris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Magon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Lungren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Horvitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">H</forename><surname>Shah</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.13714</idno>
		<idno type="arXiv">arXiv:2304.13714</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.13714" />
		<title level="m">Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Med-HALT: Medical domain hallucination test for large language models</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">K</forename><surname>Umapathi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sankarasubbu</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.15343</idno>
		<idno type="arXiv">arXiv:2307.15343</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.15343" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Transformative effects of ChatGPT on modern education: Emerging era of AI chatbots</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Gill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Patros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kaur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kaur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Fuller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Arora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Parlikad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stankovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abraham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lutfiyya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Kanhere</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bahsoon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">F</forename><surname>Rana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dustdar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sakellariou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Uhlig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Buyya</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.03823</idno>
		<idno type="arXiv">arXiv:2306.03823</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.03823" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Hallucination is the last thing you need</title>
		<author>
			<persName><forename type="first">S</forename><surname>Curran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lansley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bethell</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.11520</idno>
		<idno type="arXiv">arXiv:2306.11520</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.11520" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Survey of hallucination in natural language generation</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Frieske</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ishii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Madotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<idno type="DOI">10.1145/3571730</idno>
		<ptr target="https://doi.org/10.1145/3571730" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="page">38</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">On faithfulness and factuality in abstractive summarization</title>
		<author>
			<persName><forename type="first">J</forename><surname>Maynez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Narayan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bohnet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>McDonald</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.173</idno>
		<ptr target="https://doi.org/10.18653/v1/2020.acl-main.173" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</meeting>
		<imprint>
			<publisher>ACL</publisher>
			<date type="published" when="2020">July 5-10, 2020</date>
			<biblScope unit="page" from="1906" to="1919" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">How language model hallucinations can snowball</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Press</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Merrill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Smith</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13534</idno>
		<idno type="arXiv">arXiv:2305.13534</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13534" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Mi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Shang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.12966</idno>
		<idno type="arXiv">arXiv:2307.12966</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.12966" />
		<title level="m">Aligning large language models with human: A survey</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<title level="m" type="main">Automatically correcting large language models: Surveying the landscape of diverse self-correction strategies</title>
		<author>
			<persName><forename type="first">L</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Saxon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nathani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.03188</idno>
		<idno type="arXiv">arXiv:2308.03188</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.03188" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">TruthfulQA: Measuring how models mimic human falsehoods</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Evans</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.229</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.acl-long.229" />
	</analytic>
	<monogr>
		<title level="m">ACL 2022, ACL</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="3214" to="3252" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">HaluEval: A large-scale hallucination evaluation benchmark for large language models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Nie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.11747</idno>
		<idno type="arXiv">arXiv:2305.11747</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11747" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Text2KGBench: A benchmark for ontology-driven knowledge graph generation from text</title>
		<author>
			<persName><forename type="first">N</forename><surname>Mihindukulasooriya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tiwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">F</forename><surname>Enguix</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lata</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.02357</idno>
		<idno type="arXiv">arXiv:2308.02357</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.02357" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Did you read the instructions? Rethinking the effectiveness of task definitions in instruction learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Laban</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Joty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xiong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-long.172</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-long.172" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="3063" to="3079" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Improving in-context few-shot learning via self-supervised training</title>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pasunuru</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mihaylov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Iyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Kozareva</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.naacl-main.260</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.naacl-main.260" />
	</analytic>
	<monogr>
		<title level="m">NAACL 2022, ACL</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Carpuat</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>De Marneffe</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><forename type="middle">V M</forename><surname>Ruíz</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="3558" to="3573" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Sources of hallucination by large language models on inference tasks</title>
		<author>
			<persName><forename type="first">N</forename><surname>McKenna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Hosseini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Steedman</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14552</idno>
		<idno type="arXiv">arXiv:2305.14552</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14552" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<title level="m" type="main">Data distributional properties drive emergent in-context learning in transformers</title>
		<author>
			<persName><forename type="first">S</forename><surname>Chan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Santoro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Lampinen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">H</forename><surname>Richemond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>McClelland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hill</surname></persName>
		</author>
		<ptr target="http://papers.nips.cc/paper_files/paper/2022/hash/77c6ccacfd9962e2307fc64680fc5ace-Abstract-Conference.html" />
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>NeurIPS</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Let me check the examples: Enhancing demonstration learning via explicit imitation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wu</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-short.93</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-short.93" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1080" to="1088" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bartolo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Riedel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stenetorp</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.556</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.acl-long.556" />
	</analytic>
	<monogr>
		<title level="m">ACL 2022, ACL</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Muresan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Villavicencio</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="8086" to="8098" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<monogr>
		<title level="m" type="main">Hallucinations in large multilingual translation models</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">M</forename><surname>Guerreiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Alves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Waldendorf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Haddow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Birch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Colombo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F T</forename><surname>Martins</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.16104</idno>
		<idno type="arXiv">arXiv:2303.16104</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.16104" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<title level="m" type="main">Visual instruction tuning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">J</forename><surname>Lee</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.08485</idno>
		<idno type="arXiv">arXiv:2304.08485</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.08485" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<monogr>
		<title level="m" type="main">Evaluating object hallucination in large vision-language models</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.10355</idno>
		<idno type="arXiv">arXiv:2305.10355</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.10355" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b41">
	<monogr>
		<title level="m" type="main">Why does ChatGPT fall short in answering questions faithfully?</title>
		<author>
			<persName><forename type="first">S</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">C</forename><surname>Chang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.10513</idno>
		<idno type="arXiv">arXiv:2304.10513</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.10513" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b42">
	<analytic>
		<title level="a" type="main">Zero-shot faithful factual error correction</title>
		<author>
			<persName><forename type="first">K</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">P</forename><surname>Chan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-long.311</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-long.311" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="5660" to="5676" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b43">
	<analytic>
		<title level="a" type="main">RARR: researching and revising what language models say, using language models</title>
		<author>
			<persName><forename type="first">L</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pasupat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T</forename><surname>Chaganty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">Y</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Juan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Guu</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-long.910</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-long.910" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="16477" to="16508" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b44">
	<monogr>
		<title level="m" type="main">Mitigating language model hallucination with interactive question-knowledge alignment</title>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13669</idno>
		<idno type="arXiv">arXiv:2305.13669</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13669" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b45">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Halawi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Denain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Steinhardt</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.09476</idno>
		<idno type="arXiv">arXiv:2307.09476</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.09476" />
		<title level="m">Overthinking the truth: Understanding how language models process false demonstrations</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b46">
	<monogr>
		<title level="m" type="main">Histalign: Improving context dependency in language generation by aligning with history</title>
		<author>
			<persName><forename type="first">D</forename><surname>Wan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bansal</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.04782</idno>
		<idno type="arXiv">arXiv:2305.04782</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.04782" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b47">
	<analytic>
		<title level="a" type="main">The dangers of trusting stochastic parrots: Faithfulness and trust in open-domain conversational question answering</title>
		<author>
			<persName><forename type="first">S</forename><surname>Chiesurin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dimakopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A S</forename><surname>Cabezudo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Eshghi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Papaioannou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Rieser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Konstas</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.findings-acl.60</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.findings-acl.60" />
	</analytic>
	<monogr>
		<title level="m">Findings of ACL 2023, ACL</title>
		<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="947" to="959" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b48">
	<analytic>
		<title level="a" type="main">Improved natural language generation via loss truncation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Kang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hashimoto</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.66</idno>
		<ptr target="https://doi.org/10.18653/v1/2020.acl-main.66" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</title>
		<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</meeting>
		<imprint>
			<publisher>ACL</publisher>
			<date type="published" when="2020">July 5-10, 2020</date>
			<biblScope unit="page" from="718" to="731" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b49">
	<analytic>
		<title level="a" type="main">On exposure bias, hallucination and domain shift in neural machine translation</title>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sennrich</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.326</idno>
		<ptr target="https://doi.org/10.18653/v1/2020.acl-main.326" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</title>
		<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</meeting>
		<imprint>
			<publisher>ACL</publisher>
			<date type="published" when="2020">July 5-10, 2020</date>
			<biblScope unit="page" from="3544" to="3552" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b50">
	<monogr>
		<title level="m" type="main">Factuality enhanced language models for open-ended text generation</title>
		<author>
			<persName><forename type="first">N</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ping</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Patwary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shoeybi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Catanzaro</surname></persName>
		</author>
		<ptr target="http://papers.nips.cc/paper_files/paper/2022/hash/df438caa36714f69277daa92d608dd63-Abstract-Conference.html" />
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>NeurIPS</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b51">
	<analytic>
		<title level="a" type="main">The curious case of hallucinations in neural machine translation</title>
		<author>
			<persName><forename type="first">V</forename><surname>Raunak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Menezes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Junczys-Dowmunt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NAACL 2021, ACL</title>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1172" to="1183" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b52">
	<analytic>
		<title level="a" type="main">Looking for a needle in a haystack: A comprehensive study of hallucinations in neural machine translation</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">M</forename><surname>Guerreiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Voita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F T</forename><surname>Martins</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2023.eacl-main.75" />
	</analytic>
	<monogr>
		<title level="m">EACL 2023, ACL</title>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1059" to="1075" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b53">
	<monogr>
		<title level="m" type="main">Halomi: A manually annotated benchmark for multilingual hallucination and omission detection in machine translation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Voita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hansanti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ropers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kalbassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Barrault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Costa-Jussà</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.11746</idno>
		<idno type="arXiv">arXiv:2305.11746</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11746" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b54">
	<monogr>
		<title level="m" type="main">mmT5: Modular multilingual pre-training solves source language hallucinations</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pfeiffer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Piccinno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nicosia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Reid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ruder</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14224</idno>
		<idno type="arXiv">arXiv:2305.14224</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14224" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b55">
	<monogr>
		<title level="m" type="main">Evaluating correctness and faithfulness of instruction-following models for question answering</title>
		<author>
			<persName><forename type="first">V</forename><surname>Adlakha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Behnamghader</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">H</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Meade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Reddy</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.16877</idno>
		<idno type="arXiv">arXiv:2307.16877</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.16877" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b56">
	<analytic>
		<title level="a" type="main">On the origin of hallucinations in conversational models: Is it the datasets or the models?</title>
		<author>
			<persName><forename type="first">N</forename><surname>Dziri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Milton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">R</forename><surname>Zaïane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Reddy</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.naacl-main.387</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.naacl-main.387" />
	</analytic>
	<monogr>
		<title level="m">NAACL 2022, ACL</title>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="5271" to="5285" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b57">
	<analytic>
		<title level="a" type="main">Diving deep into modes of fact hallucinations in dialogue systems</title>
		<author>
			<persName><forename type="first">S</forename><surname>Das</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Saha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Srihari</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.findings-emnlp.48</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.findings-emnlp.48" />
	</analytic>
	<monogr>
		<title level="m">Findings of EMNLP 2022, ACL</title>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="684" to="699" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b58">
	<analytic>
		<title level="a" type="main">Faithdial: A faithful benchmark for information-seeking dialogue</title>
		<author>
			<persName><forename type="first">N</forename><surname>Dziri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kamalloo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Milton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">R</forename><surname>Zaïane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Ponti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Reddy</surname></persName>
		</author>
		<ptr target="https://transacl.org/ojs/index.php/tacl/article/view/4113" />
	</analytic>
	<monogr>
		<title level="j">Trans. Assoc. Comput. Linguistics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="1473" to="1490" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b59">
	<analytic>
		<title level="a" type="main">Evaluating attribution in dialogue systems: The BEGIN benchmark</title>
		<author>
			<persName><forename type="first">N</forename><surname>Dziri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Rashkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Linzen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Reitter</surname></persName>
		</author>
		<ptr target="https://transacl.org/ojs/index.php/tacl/article/view/3977" />
	</analytic>
	<monogr>
		<title level="j">Trans. Assoc. Comput. Linguistics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="1066" to="1083" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b60">
	<analytic>
		<title level="a" type="main">Contrastive learning reduces hallucination in conversations</title>
		<author>
			<persName><forename type="first">W</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>De Rijke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ren</surname></persName>
		</author>
		<ptr target="https://ojs.aaai.org/index.php/AAAI/article/view/26596" />
	</analytic>
	<monogr>
		<title level="m">AAAI 2023</title>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="13618" to="13626" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b61">
	<analytic>
		<title level="a" type="main">Evaluating the factual consistency of large language models through news summarization</title>
		<author>
			<persName><forename type="first">D</forename><surname>Tam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mascarenhas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kwan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Raffel</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.findings-acl.322</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.findings-acl.322" />
	</analytic>
	<monogr>
		<title level="m">Findings of ACL 2023, ACL</title>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="5220" to="5255" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b62">
	<analytic>
		<title level="a" type="main">Hallucinated but factual! inspecting the factuality of hallucinations in abstractive summarization</title>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C K</forename><surname>Cheung</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.236</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.acl-long.236" />
	</analytic>
	<monogr>
		<title level="m">ACL 2022, ACL</title>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="3340" to="3354" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b63">
	<analytic>
		<title level="a" type="main">&quot;Why is this misleading?&quot;: Detecting news headline hallucinations with explanations</title>
		<author>
			<persName><forename type="first">J</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Finnie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Rahmati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bendersky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Najork</surname></persName>
		</author>
		<idno type="DOI">10.1145/3543507.3583375</idno>
		<ptr target="https://doi.org/10.1145/3543507.3583375" />
	</analytic>
	<monogr>
		<title level="m">WWW 2023</title>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1662" to="1672" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b64">
	<monogr>
		<title level="m" type="main">Detecting and mitigating hallucinations in multilingual summarisation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ziser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korhonen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Ponti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Cohen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13632</idno>
		<idno type="arXiv">arXiv:2305.13632</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13632" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b65">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhang-Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Xin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Yun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Guan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.09296</idno>
		<idno type="arXiv">arXiv:2306.09296</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.09296" />
		<title level="m">Kola: Carefully benchmarking world knowledge of large language models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b66">
	<monogr>
		<title level="m" type="main">Aligning large multi-modal model with robust instruction tuning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yacoob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.14565</idno>
		<idno type="arXiv">arXiv:2306.14565</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.14565" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b67">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Mahmood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Kalra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Yan</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.14634</idno>
		<idno type="arXiv">arXiv:2307.14634</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.14634" />
		<title level="m">Fact-checking of AI-generated reports</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b68">
	<monogr>
		<title level="m" type="main">Chain of natural language inference for reducing large language model ungrounded hallucinations</title>
		<author>
			<persName><forename type="first">D</forename><surname>Lei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Yun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ching</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kamal</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.03951</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b69">
	<analytic>
		<title level="a" type="main">Bartscore: Evaluating generated text as text generation</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper/2021/hash/e4d2b6e6fdeca3e60e0f1a62fee3d9dd-Abstract.html" />
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<editor>M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, J. W. Vaughan</editor>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="27263" to="27277" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b70">
	<monogr>
		<title level="m" type="main">Knowledge of knowledge: Exploring known-unknowns uncertainty with large language models</title>
		<author>
			<persName><forename type="first">A</forename><surname>Amayuelas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13712</idno>
		<idno type="arXiv">arXiv:2305.13712</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13712" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b71">
	<analytic>
		<title level="a" type="main">Methods for measuring, updating, and visualizing factual beliefs in language models</title>
		<author>
			<persName><forename type="first">P</forename><surname>Hase</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Diab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Celikyilmaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Kozareva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Iyer</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2023.eacl-main.199" />
	</analytic>
	<monogr>
		<title level="m">EACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Vlachos</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Augenstein</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2706" to="2723" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b72">
	<monogr>
		<title level="m" type="main">Measuring and modifying factual knowledge in large language models</title>
		<author>
			<persName><forename type="first">P</forename><surname>Pezeshkpour</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.06264</idno>
		<idno type="arXiv">arXiv:2306.06264</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.06264" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b73">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Preston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Poon</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2306.16564</idno>
		<title level="m">Llm calibration and automatic hallucination detection via pareto optimal self-supervision</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b74">
	<monogr>
		<title level="m" type="main">A stitch in time saves nine: Detecting and mitigating hallucinations of llms by validating low-confidence generation</title>
		<author>
			<persName><forename type="first">N</forename><surname>Varshney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yu</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.03987</idno>
		<idno type="arXiv">arXiv:2307.03987</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.03987" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b75">
	<monogr>
		<title level="m" type="main">Language models (mostly) know what they know</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kadavath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Conerly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Henighan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Drain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Perez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Schiefer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Hatfield-Dodds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Dassarma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tran-Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Johnston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Showk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Elhage</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hume</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bowman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Fort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ganguli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jacobson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kernion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kravec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Lovitt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ndousse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Olsson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ringer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Joseph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mccandlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Olah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kaplan</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2207.05221</idno>
		<idno type="arXiv">arXiv:2207.05221</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2207.05221" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b76">
	<monogr>
		<title level="m" type="main">Selfcheckgpt: Zero-resource black-box hallucination detection for generative large language models</title>
		<author>
			<persName><forename type="first">P</forename><surname>Manakul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Liusie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J F</forename><surname>Gales</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.08896</idno>
		<idno type="arXiv">arXiv:2303.08896</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.08896" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b77">
	<monogr>
		<title level="m" type="main">Do language models know when they&apos;re hallucinating references?</title>
		<author>
			<persName><forename type="first">A</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mackey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T</forename><surname>Kalai</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.18248</idno>
		<idno type="arXiv">arXiv:2305.18248</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.18248" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b78">
	<monogr>
		<title level="m" type="main">Self-checker: Plug-and-play modules for fact-checking with large language models</title>
		<author>
			<persName><forename type="first">M</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14623</idno>
		<idno type="arXiv">arXiv:2305.14623</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14623" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b79">
	<monogr>
		<title level="m" type="main">LM vs LM: Detecting factual errors via cross examination</title>
		<author>
			<persName><forename type="first">R</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hamri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Geva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Globerson</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13281</idno>
		<idno type="arXiv">arXiv:2305.13281</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13281" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b80">
	<monogr>
		<title level="m" type="main">Self-contradictory hallucinations of large language models: Evaluation, detection and mitigation</title>
		<author>
			<persName><forename type="first">N</forename><surname>Mündler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Vechev</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.15852</idno>
		<idno type="arXiv">arXiv:2305.15852</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.15852" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b81">
	<monogr>
		<title level="m" type="main">A new benchmark and reverse validation method for passage-level hallucination detection</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.06498</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b82">
	<monogr>
		<title level="m" type="main">Factscore: Fine-grained atomic evaluation of factual precision in long form text generation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Min</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Krishna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lyu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yih</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">W</forename><surname>Koh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajishirzi</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14251</idno>
		<idno type="arXiv">arXiv:2305.14251</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14251" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b83">
	<monogr>
		<title level="m" type="main">Complex claim verification with evidence retrieved in the wild</title>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sriram</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Durrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Choi</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2305.11859</idno>
		<idno type="arXiv">arXiv:2305.11859</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11859" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b84">
	<monogr>
		<title level="m" type="main">Retrieving supporting evidence for llms generated answers</title>
		<author>
			<persName><forename type="first">S</forename><surname>Huo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Arabzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L A</forename><surname>Clarke</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2306.13781</idno>
		<idno type="arXiv">arXiv:2306.13781</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.13781" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b85">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Chern</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chern</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.13528</idno>
		<idno type="arXiv">arXiv:2307.13528</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.13528" />
		<title level="m">Factool: Factuality detection in generative AI - a tool augmented framework for multi-task and multi-domain scenarios</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b86">
	<monogr>
		<title level="m" type="main">Investigating the translation performance of a large multilingual language model: the case of BLOOM</title>
		<author>
			<persName><forename type="first">R</forename><surname>Bawden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Yvon</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.01911</idno>
		<idno type="arXiv">arXiv:2303.01911</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.01911" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b87">
	<monogr>
		<title level="m" type="main">How good are GPT models at machine translation? A comprehensive evaluation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hendy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Abdelrehim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sharaf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Raunak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gabr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Matsushita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">J</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Afify</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">H</forename><surname>Awadalla</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2302.09210</idno>
		<idno type="arXiv">arXiv:2302.09210</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2302.09210" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b88">
	<analytic>
		<title level="a" type="main">Unsupervised cross-lingual representation learning at scale</title>
		<author>
			<persName><forename type="first">A</forename><surname>Conneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Khandelwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Chaudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wenzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Guzmán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.747</idno>
		<ptr target="https://doi.org/10.18653/v1/2020.acl-main.747" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</meeting>
		<imprint>
			<publisher>ACL</publisher>
			<date type="published" when="2020">July 5-10, 2020</date>
			<biblScope unit="page" from="8440" to="8451" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b89">
	<monogr>
		<title level="m" type="main">Evaluating generative models for graph-to-text generation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Färber</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.14712</idno>
		<idno type="arXiv">arXiv:2307.14712</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.14712" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b90">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Qiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13168</idno>
		<idno type="arXiv">arXiv:2305.13168</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13168" />
		<title level="m">Llms for knowledge graph construction and reasoning: Recent capabilities and future opportunities</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b91">
	<monogr>
		<title level="m" type="main">Minigpt-4: Enhancing vision-language understanding with advanced large language models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Elhoseiny</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.10592</idno>
		<idno type="arXiv">arXiv:2304.10592</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.10592" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b92">
	<analytic>
		<title level="a" type="main">OFA: unifying architectures, tasks, and modalities through a simple sequence-to-sequence learning framework</title>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Men</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yang</surname></persName>
		</author>
		<ptr target="https://proceedings.mlr.press/v162/wang22al.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Machine Learning Research</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Chaudhuri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Jegelka</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Song</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Szepesvári</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Niu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sabato</surname></persName>
		</editor>
		<meeting>Machine Learning Research<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">162</biblScope>
			<biblScope unit="page" from="23318" to="23340" />
		</imprint>
	</monogr>
	<note>ICML 2022</note>
</biblStruct>

<biblStruct xml:id="b93">
	<analytic>
		<title level="a" type="main">Let there be a clock on the beach: Reducing object hallucination in image captioning</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F</forename><surname>Biten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gómez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Karatzas</surname></persName>
		</author>
		<idno type="DOI">10.1109/WACV51458.2022.00253</idno>
		<ptr target="https://doi.org/10.1109/WACV51458.2022.00253" />
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022</title>
				<meeting><address><addrLine>Waikoloa, HI, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2022">January 3-8, 2022</date>
			<biblScope unit="page" from="2473" to="2482" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b94">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Petryk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Whitehead</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rohrbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rohrbach</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.07021</idno>
		<idno type="arXiv">arXiv:2305.07021</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.07021" />
		<title level="m">Simple token-level confidence improves caption correctness</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b95">
	<monogr>
		<title level="m" type="main">Album storytelling with iterative story-aware captioning and large language models</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yuan</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.12943</idno>
		<idno type="arXiv">arXiv:2305.12943</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.12943" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b96">
	<analytic>
		<title level="a" type="main">Understanding factuality in abstractive summarization with FRANK: A benchmark for factuality metrics</title>
		<author>
			<persName><forename type="first">A</forename><surname>Pagnoni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Balachandran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tsvetkov</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.naacl-main.383</idno>
		<ptr target="https://doi.org/10.18653/v1/2021.naacl-main.383" />
	</analytic>
	<monogr>
		<title level="m">NAACL 2021, ACL</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="4812" to="4829" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b97">
	<analytic>
		<title level="a" type="main">Knowledge-grounded dialogue generation with a unified knowledge representation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Liden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.naacl-main.15</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.naacl-main.15" />
	</analytic>
	<monogr>
		<title level="m">NAACL 2022, ACL</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="206" to="218" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b98">
	<analytic>
		<title level="a" type="main">Faithful to the document or to the world? mitigating hallucinations via entity-linked knowledge in abstractive summarization</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wieting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Verga</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.findings-emnlp.76</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.findings-emnlp.76" />
	</analytic>
	<monogr>
		<title level="m">Findings of EMNLP 2022, ACL</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1067" to="1082" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b99">
	<analytic>
		<title level="a" type="main">FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization</title>
		<author>
			<persName><forename type="first">E</forename><surname>Durmus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Diab</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.454</idno>
		<ptr target="https://doi.org/10.18653/v1/2020.acl-main.454" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</meeting>
		<imprint>
			<publisher>ACL</publisher>
			<date type="published" when="2020">July 5-10, 2020</date>
			<biblScope unit="page" from="5055" to="5070" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b100">
	<analytic>
		<title level="a" type="main">Measuring sentence-level and aspect-level (un)certainty in science communications</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jurgens</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.emnlp-main.784</idno>
		<ptr target="https://doi.org/10.18653/v1/2021.emnlp-main.784" />
	</analytic>
	<monogr>
		<title level="m">EMNLP 2021, ACL</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Moens</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Specia</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><forename type="middle">W</forename><surname>Yih</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="9959" to="10011" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b101">
	<analytic>
		<title level="a" type="main">Editing models with task arithmetic</title>
		<author>
			<persName><forename type="first">G</forename><surname>Ilharco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Ribeiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wortsman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajishirzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farhadi</surname></persName>
		</author>
		<ptr target="https://openreview.net/pdf?id=6t0Kwf8-jrj" />
	</analytic>
	<monogr>
		<title level="m">The Eleventh International Conference on Learning Representations, ICLR 2023</title>
				<meeting><address><addrLine>Kigali, Rwanda</addrLine></address></meeting>
		<imprint>
			<publisher>OpenReview</publisher>
			<date type="published" when="2023">May 1-5, 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b102">
	<monogr>
		<title level="m" type="main">Elastic weight removal for faithful and abstractive dialogue generation</title>
		<author>
			<persName><forename type="first">N</forename><surname>Daheim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Dziri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sachan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Ponti</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.17574</idno>
		<idno type="arXiv">arXiv:2303.17574</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.17574" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b103">
	<monogr>
		<title level="m" type="main">PURR: efficiently editing language model hallucinations by denoising language model corruptions</title>
		<author>
			<persName><forename type="first">A</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pasupat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Guu</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14908</idno>
		<idno type="arXiv">arXiv:2305.14908</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14908" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b104">
	<monogr>
		<title level="m" type="main">Trusting your evidence: Hallucinate less with context-aware decoding</title>
		<author>
			<persName><forename type="first">W</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tsvetkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">W</forename><surname>Yih</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14739</idno>
		<idno type="arXiv">arXiv:2305.14739</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14739" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b105">
	<monogr>
		<title level="m" type="main">Augmented large language models with parametric knowledge guiding</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Geng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Tao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jiang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.04757</idno>
		<idno type="arXiv">arXiv:2305.04757</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.04757" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b106">
	<monogr>
		<title level="m" type="main">TRAC: trustworthy retrieval augmented chatbot</title>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bastani</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.04642</idno>
		<idno type="arXiv">arXiv:2307.04642</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.04642" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b107">
	<monogr>
		<title level="m" type="main">Inference-time intervention: Eliciting truthful answers from a language model</title>
		<author>
			<persName><forename type="first">K</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">B</forename><surname>Viégas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Pfister</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wattenberg</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.03341</idno>
		<idno type="arXiv">arXiv:2306.03341</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.03341" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b108">
	<monogr>
		<title level="m" type="main">Easyedit: An easy-to-use knowledge editing framework for large language models</title>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Xi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.07269</idno>
		<idno type="arXiv">arXiv:2308.07269</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.07269" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b109">
	<monogr>
		<author>
			<persName><forename type="first">Y.-S</forename><surname>Chuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Glass</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2309.03883</idno>
		<title level="m">Dola: Decoding by contrasting layers improves factuality in large language models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b110">
	<analytic>
		<title level="a" type="main">Neural path hunter: Reducing hallucination in dialogue systems via path grounding</title>
		<author>
			<persName><forename type="first">N</forename><surname>Dziri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Madotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Zaïane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Bose</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.emnlp-main.168</idno>
		<ptr target="https://doi.org/10.18653/v1/2021.emnlp-main.168" />
	</analytic>
	<monogr>
		<title level="m">EMNLP 2021, ACL</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Moens</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Specia</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><forename type="middle">W</forename><surname>Yih</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2197" to="2214" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b111">
	<monogr>
		<title level="m" type="main">ORCA: interpreting prompted language models via locating supporting data evidence in the ocean of pretraining data</title>
		<author>
			<persName><forename type="first">X</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tsvetkov</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2205.12600</idno>
		<idno type="arXiv">arXiv:2205.12600</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2205.12600" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b112">
	<monogr>
		<title level="m" type="main">Rethinking with retrieval: Faithful large language model inference</title>
		<author>
			<persName><forename type="first">H</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roth</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2301.00303</idno>
		<idno type="arXiv">arXiv:2301.00303</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2301.00303" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b113">
	<analytic>
		<title level="a" type="main">TRAK: attributing model behavior at scale</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Georgiev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ilyas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Leclerc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Madry</surname></persName>
		</author>
		<ptr target="https://proceedings.mlr.press/v202/park23c.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Machine Learning Research</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Brunskill</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Engelhardt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sabato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Scarlett</surname></persName>
		</editor>
		<meeting>Machine Learning Research<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">202</biblScope>
			<biblScope unit="page" from="27074" to="27113" />
		</imprint>
	</monogr>
	<note>ICML 2023</note>
</biblStruct>

<biblStruct xml:id="b114">
	<monogr>
		<title level="m" type="main">Data portraits: Recording foundation model training data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Marone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">V</forename><surname>Durme</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.03919</idno>
		<idno type="arXiv">arXiv:2303.03919</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.03919" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b115">
	<monogr>
		<title level="m" type="main">Self-refine: Iterative refinement with self-feedback</title>
		<author>
			<persName><forename type="first">A</forename><surname>Madaan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tandon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hallinan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wiegreffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Alon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Dziri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Prabhumoye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Welleck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yazdanbakhsh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Clark</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.17651</idno>
		<idno type="arXiv">arXiv:2303.17651</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.17651" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b116">
	<monogr>
		<title level="m" type="main">Reflexion: an autonomous agent with dynamic memory and self-reflection</title>
		<author>
			<persName><forename type="first">N</forename><surname>Shinn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Labash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gopinath</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2303.11366</idno>
		<idno type="arXiv">arXiv:2303.11366</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2303.11366" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b117">
	<monogr>
		<title level="m" type="main">"According to ...": Prompting language models improves quoting from pre-training data</title>
		<author>
			<persName><forename type="first">O</forename><surname>Weller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Marone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Weir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Lawrie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Khashabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">V</forename><surname>Durme</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13252</idno>
		<idno type="arXiv">arXiv:2305.13252</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13252" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b118">
	<analytic>
		<title level="a" type="main">Verify-and-edit: A knowledge-enhanced chain-of-thought framework</title>
		<author>
			<persName><forename type="first">R</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Joty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bing</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-long.320</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-long.320" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="5823" to="5840" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b119">
	<monogr>
		<title level="m" type="main">Chain-of-verification reduces hallucination in large language models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Dhuliawala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Komeili</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Raileanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Celikyilmaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2309.11495</idno>
		<idno type="arXiv">arXiv:2309.11495</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2309.11495" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b120">
	<analytic>
		<title level="a" type="main">Improving language models by retrieving from trillions of tokens</title>
		<author>
			<persName><forename type="first">S</forename><surname>Borgeaud</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mensch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hoffmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rutherford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Millican</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Van Den Driessche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lespiau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Damoc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>De Las Casas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Menick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ring</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hennigan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Maggiore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cassirer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Brock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Paganini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Irving</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Vinyals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Osindero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Simonyan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Rae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Elsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sifre</surname></persName>
		</author>
		<ptr target="https://proceedings.mlr.press/v162/borgeaud22a.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Machine Learning Research</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Chaudhuri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Jegelka</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Song</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Szepesvári</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Niu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sabato</surname></persName>
		</editor>
		<meeting>Machine Learning Research<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">162</biblScope>
			<biblScope unit="page" from="2206" to="2240" />
		</imprint>
	</monogr>
	<note>ICML</note>
</biblStruct>

<biblStruct xml:id="b121">
	<analytic>
		<title level="a" type="main">Interleaving retrieval with chain-of-thought reasoning for knowledge-intensive multi-step questions</title>
		<author>
			<persName><forename type="first">H</forename><surname>Trivedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Balasubramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Khot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sabharwal</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-long.557</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-long.557" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="10014" to="10037" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b122">
	<monogr>
		<title level="m" type="main">Check your facts and try again: Improving large language models with external knowledge and automated feedback</title>
		<author>
			<persName><forename type="first">B</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Galley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Liden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2302.12813</idno>
		<idno type="arXiv">arXiv:2302.12813</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2302.12813" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b123">
	<monogr>
		<author>
			<persName><forename type="first">Q</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.09667</idno>
		<idno type="arXiv">arXiv:2304.09667</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.09667" />
		<title level="m">Genegpt: Augmenting large language models with domain tools for improved access to biomedical information</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b124">
	<analytic>
		<title level="a" type="main">Fluid transformers and creative analogies: Exploring large language models&apos; capacity for augmenting cross-domain analogical creativity</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Srinivasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Macneil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chan</surname></persName>
		</author>
		<idno type="DOI">10.1145/3591196.3593516</idno>
		<ptr target="https://doi.org/10.1145/3591196.3593516" />
	</analytic>
	<monogr>
		<title level="m">Creativity and Cognition, C&amp;C 2023, Virtual Event</title>
				<meeting><address><addrLine>USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2023">June 19-21, 2023</date>
			<biblScope unit="page" from="489" to="505" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b125">
	<monogr>
		<title level="m" type="main">Chain of knowledge: A framework for grounding large language models with structured knowledge bases</title>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">K</forename><surname>Chia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Joty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Poria</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13269</idno>
		<idno type="arXiv">arXiv:2305.13269</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13269" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b126">
	<monogr>
		<title level="m" type="main">Active retrieval augmented generation</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">F</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dwivedi-Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Callan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.06983</idno>
		<idno type="arXiv">arXiv:2305.06983</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.06983" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b127">
	<monogr>
		<title level="m" type="main">Gorilla: Large language model connected with massive apis</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">G</forename><surname>Patil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Gonzalez</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.15334</idno>
		<idno type="arXiv">arXiv:2305.15334</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.15334" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b128">
	<monogr>
		<title level="m" type="main">RETA-LLM: A retrieval-augmented large language model toolkit</title>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.05212</idno>
		<idno type="arXiv">arXiv:2306.05212</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.05212" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b129">
	<monogr>
		<title level="m" type="main">KnowledGPT: Enhancing large language models with retrieval and storage access on knowledge bases</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.11761</idno>
		<idno type="arXiv">arXiv:2308.11761</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.11761" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b130">
	<analytic>
		<title level="a" type="main">Learning to summarize with human feedback</title>
		<author>
			<persName><forename type="first">N</forename><surname>Stiennon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ouyang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Ziegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lowe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Voss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">F</forename><surname>Christiano</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper/2020/hash/1f89885d556929e98d3ef9b86448f951-Abstract.html" />
	</analytic>
	<monogr>
		<title level="m">NeurIPS 2020</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Larochelle</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Ranzato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Hadsell</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Balcan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Lin</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b131">
	<monogr>
		<title level="m" type="main">Teaching language models to support answers with verified quotes</title>
		<author>
			<persName><forename type="first">J</forename><surname>Menick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Trebacz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mikulik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Aslanides</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">F</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Chadwick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Glaese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Young</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Campbell-Gillingham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Irving</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Mcaleese</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2203.11147</idno>
		<idno type="arXiv">arXiv:2203.11147</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2203.11147" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b132">
	<analytic>
		<title level="a" type="main">BRIO: bringing order to abstractive summarization</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">R</forename><surname>Radev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.207</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.acl-long.207" />
	</analytic>
	<monogr>
		<title level="m">ACL 2022, ACL</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Muresan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Villavicencio</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="2890" to="2903" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b133">
	<monogr>
		<title level="m" type="main">Chain of hindsight aligns language models with feedback</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sferrazza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Abbeel</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2302.02676</idno>
		<idno type="arXiv">arXiv:2302.02676</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2302.02676" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b134">
	<monogr>
		<title level="m" type="main">CRITIC: large language models can self-correct with tool-interactive critiquing</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Gou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Duan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.11738</idno>
		<idno type="arXiv">arXiv:2305.11738</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11738" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b135">
	<monogr>
		<title level="m" type="main">Pad: Program-aided distillation specializes large models in reasoning</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhou</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.13888</idno>
		<idno type="arXiv">arXiv:2305.13888</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.13888" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b136">
	<monogr>
		<title level="m" type="main">Enabling large language models to generate text with citations</title>
		<author>
			<persName><forename type="first">T</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14627</idno>
		<idno type="arXiv">arXiv:2305.14627</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14627" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b137">
	<analytic>
		<title level="a" type="main">Improving factuality of abstractive summarization without sacrificing summary quality</title>
		<author>
			<persName><forename type="first">T</forename><surname>Dixit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.acl-short.78</idno>
		<ptr target="https://doi.org/10.18653/v1/2023.acl-short.78" />
	</analytic>
	<monogr>
		<title level="m">ACL 2023, ACL</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="902" to="913" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b138">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ishii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.06271</idno>
		<title level="m">Towards mitigating hallucination in large language models via self-reflection</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b139">
	<monogr>
		<title level="m" type="main">Improving factuality and reasoning in language models through multiagent debate</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Torralba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Tenenbaum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mordatch</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.14325</idno>
		<idno type="arXiv">arXiv:2305.14325</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.14325" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b140">
	<monogr>
		<title level="m" type="main">Encouraging divergent thinking in large language models through multi-agent debate</title>
		<author>
			<persName><forename type="first">T</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Tu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shi</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.19118</idno>
		<idno type="arXiv">arXiv:2305.19118</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.19118" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b141">
	<monogr>
		<title level="m" type="main">Examining the inter-consistency of large language models: An in-depth analysis via debate</title>
		<author>
			<persName><forename type="first">K</forename><surname>Xiong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Qin</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.11595</idno>
		<idno type="arXiv">arXiv:2305.11595</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11595" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b142">
	<monogr>
		<title level="m" type="main">PRD: peer rank and discussion improve large language model based evaluations</title>
		<author>
			<persName><forename type="first">R</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Du</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.02762</idno>
		<idno type="arXiv">arXiv:2307.02762</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.02762" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b143">
	<monogr>
		<title level="m" type="main">Unleashing cognitive synergy in large language models: A task-solving agent through multi-persona self-collaboration</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.05300</idno>
		<idno type="arXiv">arXiv:2307.05300</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.05300" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b144">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Minsky</surname></persName>
		</author>
		<title level="m">Society of mind</title>
				<imprint>
			<publisher>Simon and Schuster</publisher>
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b145">
	<analytic>
		<title level="a" type="main">A few more examples may be worth billions of parameters</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Kirstain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S H</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Riedel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.findings-emnlp.72</idno>
		<ptr target="https://doi.org/10.18653/v1/2022.findings-emnlp.72" />
	</analytic>
	<monogr>
		<title level="m">Findings of EMNLP 2022, ACL</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>Kozareva</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1017" to="1029" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b146">
	<monogr>
		<title level="m" type="main">LIMA: less is more for alignment</title>
		<author>
			<persName><forename type="first">C</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Iyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Efrat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.11206</idno>
		<idno type="arXiv">arXiv:2305.11206</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11206" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b147">
	<monogr>
		<title level="m" type="main">Natural instructions: Benchmarking generalization to new tasks from natural language instructions</title>
		<author>
			<persName><forename type="first">S</forename><surname>Mishra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Khashabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Baral</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajishirzi</surname></persName>
		</author>
		<idno>CoRR abs/2104.08773</idno>
		<ptr target="https://arxiv.org/abs/2104.08773" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b148">
	<analytic>
		<title level="a" type="main">Multitask prompted training enables zero-shot task generalization</title>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Webson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Raffel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">H</forename><surname>Bach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sutawika</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Alyafeai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chaffin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Stiegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Raja</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Bari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Thakker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Szczechla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chhablani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">V</forename><surname>Nayak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Datta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Manica</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">X</forename><surname>Yong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Pandey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bawden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Neeraj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Rozen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Santilli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Févry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Fries</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Teehan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">L</forename><surname>Scao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Biderman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Rush</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=9Vrb9D0WI4" />
	</analytic>
	<monogr>
		<title level="m">The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event</title>
				<meeting><address><addrLine>OpenReview</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">April 25-29, 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b149">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Bao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wei</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.14346</idno>
		<title level="m">DISC-MedLLM: Bridging general large language models and real-world medical consultation</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b150">
	<monogr>
		<title level="m" type="main">InstructIE: A Chinese instruction-based information extraction dataset</title>
		<author>
			<persName><forename type="first">H</forename><surname>Gui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.11527</idno>
		<idno type="arXiv">arXiv:2305.11527</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.11527" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b151">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<ptr target="https://github.com/michael-wzhu/ShenNong-TCM-LLM" />
		<title level="m">ShenNong-TCM: A traditional Chinese medicine large language model</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b152">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.06259</idno>
		<idno type="arXiv">arXiv:2308.06259</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.06259" />
		<title level="m">Self-alignment with instruction backtranslation</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b153">
	<monogr>
		<title level="m" type="main">AlpaGasus: Training a better Alpaca with fewer data</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gunaratna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Srinivasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jin</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2307.08701</idno>
		<idno type="arXiv">arXiv:2307.08701</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2307.08701" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b154">
	<monogr>
		<title level="m" type="main">Tree of thoughts: Deliberate problem solving with large language models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Shafran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">L</forename><surname>Griffiths</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Narasimhan</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.10601</idno>
		<idno type="arXiv">arXiv:2305.10601</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.10601" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b155">
	<monogr>
		<title level="m" type="main">Cumulative reasoning with large language models</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Yao</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.04371</idno>
		<idno type="arXiv">arXiv:2308.04371</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.04371" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b156">
	<analytic>
		<title level="a" type="main">PAL: program-aided language models</title>
		<author>
			<persName><forename type="first">L</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Madaan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Alon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Callan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<ptr target="https://proceedings.mlr.press/v202/gao23f.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Machine Learning Research</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Brunskill</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Engelhardt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sabato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Scarlett</surname></persName>
		</editor>
		<meeting>Machine Learning Research<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">202</biblScope>
			<biblScope unit="page" from="10764" to="10799" />
		</imprint>
	</monogr>
	<note>ICML 2023</note>
</biblStruct>

<biblStruct xml:id="b157">
	<monogr>
		<title level="m" type="main">Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks</title>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">W</forename><surname>Cohen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2211.12588</idno>
		<idno type="arXiv">arXiv:2211.12588</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2211.12588" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b158">
	<monogr>
		<title level="m" type="main">When do program-of-thoughts work for reasoning?</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Bi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.15452</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b159">
	<analytic>
		<title level="a" type="main">Dual-process and dual-system theories of reasoning</title>
		<author>
			<persName><forename type="first">K</forename><surname>Frankish</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Philosophy Compass</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="914" to="926" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b160">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Stanovich</surname></persName>
		</author>
		<title level="m">Rationality and the reflective mind</title>
				<meeting><address><addrLine>USA</addrLine></address></meeting>
		<imprint>
			<publisher>Oxford University Press</publisher>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b161">
	<analytic>
		<title level="a" type="main">The first computational theory of mind and brain: a close look at McCulloch and Pitts&apos;s &quot;logical calculus of ideas immanent in nervous activity&quot;</title>
		<author>
			<persName><forename type="first">G</forename><surname>Piccinini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Synthese</title>
		<imprint>
			<biblScope unit="volume">141</biblScope>
			<biblScope unit="page" from="175" to="215" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b162">
	<analytic>
		<title/>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Thorndike</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Animal intelligence</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="page" from="390" to="390" />
			<date type="published" when="1898">1898</date>
		</imprint>
	</monogr>
	<note>Nature</note>
</biblStruct>

<biblStruct xml:id="b163">
	<analytic>
		<title level="a" type="main">BLIP-2: bootstrapping language-image pre-training with frozen image encoders and large language models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Savarese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">C H</forename><surname>Hoi</surname></persName>
		</author>
		<ptr target="https://proceedings.mlr.press/v202/li23q.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Machine Learning Research</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Brunskill</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Engelhardt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sabato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Scarlett</surname></persName>
		</editor>
		<meeting>Machine Learning Research<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">202</biblScope>
			<biblScope unit="page" from="19730" to="19742" />
		</imprint>
	</monogr>
	<note>ICML 2023</note>
</biblStruct>

<biblStruct xml:id="b164">
	<monogr>
		<title level="m" type="main">InstructBLIP: Towards general-purpose vision-language models with instruction tuning</title>
		<author>
			<persName><forename type="first">W</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M H</forename><surname>Tiong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">C H</forename><surname>Hoi</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2305.06500</idno>
		<idno type="arXiv">arXiv:2305.06500</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2305.06500" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b165">
	<monogr>
		<author>
			<persName><forename type="first">Q</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Huang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.14178</idno>
		<idno type="arXiv">arXiv:2304.14178</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2304.14178" />
		<title level="m">mPLUG-Owl: Modularization empowers large language models with multimodality</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b166">
	<monogr>
		<title level="m" type="main">Tiny LVLM-eHub: Early multimodal experiments with Bard</title>
		<author>
			<persName><forename type="first">W</forename><surname>Shao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Luo</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.03729</idno>
		<idno type="arXiv">arXiv:2308.03729</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2308.03729" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b167">
	<monogr>
		<title level="m" type="main">A survey on multimodal large language models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2306.13549</idno>
		<idno type="arXiv">arXiv:2306.13549</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2306.13549" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
