<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Improving Language Model Predictions via Prompts Enriched with Knowledge Graphs ⋆</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Ryan</forename><surname>Brate</surname></persName>
							<email>r.brate@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">KNAW Humanities Cluster</orgName>
								<orgName type="department" key="dep2">Digital Humanities Lab</orgName>
								<address>
									<settlement>Amsterdam</settlement>
									<country key="NL">Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Minh-Hoang</forename><surname>Dang</surname></persName>
							<email>minhhoangdang@hotmail.com</email>
							<affiliation key="aff1">
								<orgName type="department">Faculté des Sciences et Techniques (FST)</orgName>
								<orgName type="laboratory">LS2N</orgName>
								<orgName type="institution">Université de Nantes</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Fabian</forename><surname>Hoppe</surname></persName>
							<email>fabian.hoppe@kit.edu</email>
							<affiliation key="aff2">
								<orgName type="department">Leibniz Institute for Information Infrastructure</orgName>
								<orgName type="institution">FIZ Karlsruhe</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="institution" key="instit1">Karlsruhe Institute of Technology</orgName>
								<orgName type="institution" key="instit2">Institute AIFB</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yuan</forename><surname>He</surname></persName>
							<email>yuan.he@cs.ox.ac.uk</email>
							<affiliation key="aff4">
								<orgName type="institution">University of Oxford</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Albert</forename><surname>Meroño-Peñuela</surname></persName>
							<affiliation key="aff5">
								<orgName type="institution">King&apos;s College London</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Vijay</forename><surname>Sadashivaiah</surname></persName>
							<affiliation key="aff6">
								<orgName type="institution">Rensselaer Polytechnic Institute</orgName>
								<address>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Improving Language Model Predictions via Prompts Enriched with Knowledge Graphs ⋆</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">D8E49FBEE0111F8189A6B070D419F3AB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T09:15+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Prompt Learning</term>
					<term>Pre-trained Language Model</term>
					<term>Knowledge Graph</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Despite advances in deep learning and knowledge graphs (KGs), using language models for natural language understanding and question answering remains a challenging task. Pre-trained language models (PLMs) have been shown to be able to leverage contextual information to complete cloze prompts and to perform next-sentence completion and question answering tasks in various domains. Unlike querying structured data in, e.g., KGs, mapping an input question to data that may or may not be stored by the language model is not a simple task. Recent studies have highlighted the improvements that can be made to the quality of information retrieved from PLMs by adding auxiliary data to otherwise naive prompts. In this paper, we explore the effects on language model performance of enriching prompts with additional contextual information leveraged from the Wikidata KG. Specifically, we compare the performance of naive vs. KG-engineered cloze prompts for entity genre classification in the movie domain. Selecting a broad range of commonly available Wikidata properties, we show that enriching cloze-style prompts with Wikidata information can result in significantly higher recall for the investigated BERT and RoBERTa large PLMs. However, it is also apparent that the optimal level of data enrichment differs between models.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Pre-trained language models (PLMs) <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, based on deep learning attention-based architectures, have been shown to achieve outstanding performance at various natural language processing (NLP) tasks predicated on natural language understanding. However, the extent to which they capture domain knowledge and empirical semantics <ref type="bibr" target="#b2">[3]</ref>, i.e. the use of formal domain properties in practice, is not well understood. In this work, we narrow the focus to cloze-style completion, the task of predicting the masked entity text in a sentence. For example, given "The Klingons are a species in the franchise [MASK]", the PLM is expected to predict "Star Trek" for [MASK]. This task aims to extract the implicit knowledge entailed by PLMs, since such knowledge can be used for downstream NLP applications like sentiment analysis <ref type="bibr" target="#b3">[4]</ref>, dialogue systems <ref type="bibr" target="#b4">[5]</ref>, and natural language inference <ref type="bibr" target="#b5">[6]</ref>, as well as for completing the missing information of knowledge graphs (KGs) or ontologies <ref type="bibr" target="#b6">[7]</ref>, and even constructing new ones <ref type="bibr" target="#b7">[8]</ref>.</p><p>In recent years, PLMs have improved on the state of the art in many NLP tasks by leveraging large text corpora <ref type="bibr" target="#b8">[9]</ref>, but most of the time they still require annotated data for task-specific fine-tuning <ref type="bibr" target="#b9">[10]</ref>. Moreover, the empirical semantics gathered by these models is limited to distributional aspects <ref type="bibr" target="#b10">[11]</ref>. Therefore, performance, especially in the few- and zero-shot setting, depends heavily on the provided prompt, i.e. snippets of contextual information for a specific task. However, in many cases the engineering of the prompts is naive and simplistic, giving the PLM too little context to provide an accurate answer, and unsystematic, offering few principles on how exactly these prompts need to be composed in order to obtain predictable behaviour. Indeed, recent studies <ref type="bibr" target="#b11">[12]</ref> have highlighted the improvements that can be made to the quality of information retrieved from PLMs by amending these prompts. This casts doubt on studies <ref type="bibr" target="#b12">[13]</ref> claiming that a PLM cannot answer easy questions about, e.g., culture (movies, books, music, ...): it is reasonable to postulate that PLMs could answer those questions accurately if they were provided with systematically engineered prompts that contained richer contexts.</p><p>Existing approaches to prompt engineering include: (i) learn-by-example, where the prompt consists of a concatenation of correct examples of what we expect a PLM to predict <ref type="bibr" target="#b1">[2]</ref>; (ii) manually designed prompts of different granularities <ref type="bibr" target="#b12">[13]</ref>; (iii) automatically searched prompts optimized on few-shot samples <ref type="bibr" target="#b13">[14]</ref>; all of which rely on the implicit semantics of natural language texts. In this paper, we investigate how incorporating explicit knowledge from external sources like KGs can help prompt engineering and thus enhance the cloze-style question answering of PLMs. 
Specifically, we explore cloze-style prompts in the movie domain, evaluating the performance of the BERT and RoBERTa large PLMs.</p></div>
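<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the cloze-style querying concrete, the following is a minimal Python sketch using the HuggingFace transformers fill-mask pipeline; the model name and top_k value are illustrative choices, not prescribed by this paper.</p><p>
from transformers import pipeline

# BERT uses the literal token [MASK]; RoBERTa uses its own mask token.
unmasker = pipeline("fill-mask", model="bert-large-uncased")

predictions = unmasker("Die Hard is a movie, of the genre [MASK].", top_k=5)
for p in predictions:
    print(p["token_str"], p["score"])  # candidate genre token and its probability
</p></div>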
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Studies of prompt learning are based on the hypothesis that pre-trained language models (PLMs) have learnt abundant knowledge and just require sufficiently detailed contexts for predictions <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b14">15]</ref>; in this way, it is possible to apply PLMs without data-driven fine-tuning. A (hard<ref type="foot" target="#foot_0">1</ref> ) prompt is the conditioning text which is combined with the input to provide context or hints for the PLM. A template (i.e. pattern) is a function that integrates the inputs and prompts. Answers are then given by the PLM conditioned on the prompts, and a further function (i.e. verbalizer) is often required to map the answers to the final outputs. This is because the prompt learning paradigm is typically formulated as a task similar to the PLM's pre-training task, which does not necessarily yield the desired outputs of downstream applications.</p><p>An important part of prompt learning is prompt engineering, i.e., designing template(s), either manually or automatically, to support downstream applications. In <ref type="bibr" target="#b1">[2]</ref>, Brown et al. proposed to use demonstrations, i.e., a sequence of input-output texts, as the prompts, expecting that the PLM can implicitly learn to predict from examples. For instance, if we want the PLM to predict the masked position in "[MASK] is the capital of China.", we can demonstrate by appending "London is the capital of the UK" after the masked sentence. Schick et al. <ref type="bibr" target="#b15">[16]</ref> manually designed different templates, each corresponding to an individual PLM trained on few-shot examples. The predictions of downstream text classification and natural language inference tasks were then made according to an ensemble of the trained PLMs. Shin et al. <ref type="bibr" target="#b13">[14]</ref> argued that manually designed templates suffer from the uncertainty of guesswork or a lack of domain expertise; therefore, they proposed to search for templates using gradient-based optimization. More recently, Lu et al. <ref type="bibr" target="#b16">[17]</ref> have shown that PLM performance varies with the order of these prompts, and used generative language models and entropy statistics over prompt permutations to identify prompts with good performance.</p><p>KGs and ontologies are excellent sources of explicit knowledge for enriching prompts or verbalizers. West et al. <ref type="bibr" target="#b17">[18]</ref> considered distilling a student model in the commonsense domain from the enormously large PLM GPT-3 <ref type="bibr" target="#b1">[2]</ref>, which serves as the teacher model. They adopted the prompt learning scheme to extract triples from the teacher model, with templates created, and examples extracted, from the commonsense KG Atomic <ref type="bibr" target="#b18">[19]</ref>. Hu et al. <ref type="bibr" target="#b6">[7]</ref> argued that the label word space (i.e., the answer space) can be usefully expanded by incorporating external knowledge about related words. They employed different refinement heuristics to shortlist candidates to benefit the downstream classification task. For instance, if some "Person" is classified as a "Physicist" in the ground truth data, then answers like "Scientist" will also be accepted.</p><p>Our work was motivated by the probing study of Penha et al. 
<ref type="bibr" target="#b12">[13]</ref> that investigates whether BERT (a well-known PLM consisting of stacked transformer encoders <ref type="bibr" target="#b0">[1]</ref>) actually knows superficial cultural knowledge about books, movies, and music. Cloze-style questions for classifying the genre of entities (from Wikidata) of different books, movies, and music were given for the PLM to answer, often with unsatisfying performance. However, their work considered naive prompts without sufficient contexts, while ours attempts to examine if KGs can enrich these prompts, especially giving additional contexts (e.g., attributes, 𝑘-hop neighbours) of the entities in order to help the PLM to generate better predictions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>The basic idea of our method is to use the information about entities in KGs to expand cloze-style prompts with richer entity descriptions. It is summarized in Figure <ref type="figure" target="#fig_0">1</ref>. We enrich the naive prompt, for example "Die Hard is of genre [MASK]", by matching the movie Die Hard to the corresponding Wikidata item, extracting auxiliary knowledge with SPARQL queries, and generating an enriched prompt using this auxiliary data. We use datatype properties and verbalize entities using rdfs:label to compose valid phrases. As a result, we obtain an enriched prompt such as "Die Hard is a movie starring Bruce Willis, of the genre [MASK]". We then use both (a) the naive prompts and (b) the KG-enriched prompts to query various language models, and compare their performance on the entity genre classification task. In the following subsections, the enrichment by KG querying and the prompt engineering step are described in detail.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Knowledge Graph Querying</head><p>The auxiliary data for each movie is extracted from Wikidata. This is done in a simple two-step process using SPARQL queries. The queries operate on a batch of input records to reduce the number of requests and avoid timeout errors.</p><p>First, the movies are linked to their respective Wikidata entities by IMDb or TMDb ID, utilizing the Wikidata properties IMDb ID (wdt:P345) and TMDb movie ID (wdt:P4947). If this does not yield an entity, exact string matching on the title is attempted as well.</p><p>SELECT ?mlId ?imdbId ?tmdbId ?movie WHERE { VALUES (?mlId ?imdbId ?tmdbId) {("1" "tt0114709" "862" ) ... } {?movie wdt:P345 ?imdbId . } UNION {?movie wdt:P4947 ?tmdbId .} } Listing 1: SPARQL query used for entity linking with the IMDb or TMDb ID.</p><p>The second step queries the entities for the auxiliary data used to enrich the prompts with additional contextual information. Overall, a set of 28 properties was extracted and investigated for each entity. A simplified version of the utilized SPARQL query is given in Listing 2; it can easily be adapted to other properties by adding them to the ?property values. From this set of properties, a subset of 10 manually selected domain-specific properties is used to construct the enriched prompts. The properties were selected based on human intuition and their frequency of co-occurrence for the given entities. Listing 2: Simplified SPARQL query used to retrieve additional movie knowledge from Wikidata.</p></div>
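<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a concrete illustration of this second step, the following Python sketch posts a property query of the kind described above to the public Wikidata SPARQL endpoint. It is a hedged reconstruction of the shape of Listing 2, not its verbatim text; the QID (assumed to come from the entity-linking step of Listing 1) and the property subset are illustrative.</p><p>
import requests

WDQS = "https://query.wikidata.org/sparql"

def fetch_properties(qid, pids):
    """Retrieve property values (with English labels where available) for one movie item."""
    props = " ".join(f"wdt:{p}" for p in pids)
    query = f"""
    SELECT ?property ?value ?valueLabel WHERE {{
      VALUES ?property {{ {props} }}
      wd:{qid} ?property ?value .
      OPTIONAL {{ ?value rdfs:label ?valueLabel .
                  FILTER(LANG(?valueLabel) = "en") }}
    }}"""
    r = requests.get(WDQS, params={"query": query, "format": "json"},
                     headers={"User-Agent": "kg-prompt-enrichment-demo/0.1"})
    r.raise_for_status()
    return r.json()["results"]["bindings"]

# "Q105598" is assumed to be the Wikidata item for Die Hard, obtained via Listing 1.
rows = fetch_properties("Q105598", ["P57", "P161", "P577"])
</p></div>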
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Prompt Engineering</head><p>Similarly to <ref type="bibr" target="#b12">[13]</ref>, we consider an entity genre classification task. The prompts are of the form: "&lt;title&gt; is a movie &lt;Wikidata enrichment&gt;, of the genre [MASK].", where &lt;Wikidata enrichment&gt; is an aggregation of movie properties and corresponding values extracted from Wikidata pertaining to the title in question, in some natural language format. Table <ref type="table">1</ref> lists the Wikidata properties used to assemble values for &lt;Wikidata enrichment&gt;.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Wikidata properties used in constructing probes for the movie dataset. 'Enrichment Text' is the text adopted in the probe enrichment to describe the property in question in a more natural language format.</p><p>The Wikidata properties listed in Table <ref type="table">1</ref> are broadly ranked in descending information specificity. In this order, ten variations of a probe were constructed by sequentially adding Wikidata properties to the prompt, building gradually more information-dense prompts. In adding property information, only the first value of each Wikidata property was used where more than one was available (e.g., the first listed cast member). For example, the non-enriched prompt pertaining to the movie Die Hard reads as follows; the successive enrichments are assembled analogously, as shown in the sketch below:</p><p>• non-enriched prompt: Die Hard is a movie, of the genre [MASK].</p></div>
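<div xmlns="http://www.tei-c.org/ns/1.0"><p>The following Python sketch shows how such successively enriched prompts can be assembled from Table 1's enrichment texts, including the two verbalisation strategies evaluated in Section 4.2 (comma-separated vs. 'and'-separated). The property values shown for Die Hard are illustrative assumptions about what Wikidata returns first, not figures reported in the paper.</p><p>
# (enrichment text, first property value) pairs in descending information
# specificity; the values for Die Hard are illustrative.
ENRICHMENTS = [
    ("starring", "Bruce Willis"),
    ("directed by", "John McTiernan"),
    ("produced by", "Lawrence Gordon"),
    ("released", "1988"),
]

def build_prompt(title, k, strategy="A"):
    """Prompt enriched with the first k properties; A = comma-separated, B = 'and'-separated."""
    parts = [f"{text} {value}" for text, value in ENRICHMENTS[:k]]
    sep = ", " if strategy == "A" else " and "
    enrichment = " " + sep.join(parts) if parts else ""
    return f"{title} is a movie{enrichment}, of the genre [MASK]."

print(build_prompt("Die Hard", 0))  # non-enriched prompt
print(build_prompt("Die Hard", 2))  # first two successive enrichments
</p></div>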
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Evaluation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Dataset</head><p>In order to test our approach, we use the BERT <ref type="bibr" target="#b0">[1]</ref> and RoBERTa large <ref type="bibr" target="#b19">[20]</ref> pre-trained models.</p><p>The test dataset we use is a subset of the MovieLens 25M (ML25M) dataset <ref type="bibr" target="#b20">[21]</ref>. ML25M contains the titles and ground-truth genre classifications of 54,758 movies. A subset of this dataset was assembled, comprising those movies for which the Wikidata properties listed in Table <ref type="table">1</ref> were present in full. This resulted in a test set of 9,596 movie titles. The Wikidata properties, and thus the corresponding data subset, were selected as a compromise between a large dataset and a diverse set of domain-relevant Wikidata properties, following exploratory analysis of the ML25M dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Results</head><p>Table <ref type="table" target="#tab_3">2</ref> lists the recall@n scores for each of the prompts described in Section 3.2, for the BERT and RoBERTa large models respectively. For a given model and prompt, the recall@1 and recall@5 values for each movie are calculated as the fraction of the movie's ground-truth genres among the n highest ranked PLM mask predictions. The aggregated recall@n values reported in Table <ref type="table" target="#tab_3">2</ref> are the micro-averaged recall@n scores across all movies in the test dataset, with respect to the model and prompt referenced. As Table <ref type="table" target="#tab_3">2</ref> shows, certain variations of the enriched probes achieved greater R@n scores than the non-enriched case, for both the BERT and RoBERTa large models, across verbalisation strategies. We test the statistical significance of the R@n outcomes of the highest performing enriched prompts (bolded) against the non-enriched case via a one-tailed, dependent t-test, where the null hypothesis is that the average of the per-movie R@n differences is 0, and the alternative hypothesis is that this average is non-zero in favour of the selected enriched probe. A significance level of 0.05 is applied. Given the p-values in Table <ref type="table">3</ref>, we can affirm with statistical significance that the enriched prompts are more performant overall.</p></div>
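<div xmlns="http://www.tei-c.org/ns/1.0"><p>The following is a minimal Python sketch of the evaluation just described: per-movie recall@n over the top-n mask predictions, aggregation over the test set, and the paired one-tailed t-test comparing an enriched prompt against the non-enriched one. The per-movie data is a toy illustration; scipy's ttest_rel with alternative="greater" (available from scipy 1.6) expresses the directional hypothesis.</p><p>
from scipy import stats

def recall_at_n(predicted, gold, n):
    """Fraction of a movie's ground-truth genres among the top-n mask predictions."""
    hits = set(predicted[:n]).intersection(gold)
    return len(hits) / len(gold)

# Toy per-movie data (three movies): top-5 predictions and ground-truth genres.
naive_preds    = [["drama", "comedy", "war", "crime", "horror"],
                  ["drama", "romance", "war", "comedy", "crime"],
                  ["comedy", "drama", "horror", "war", "crime"]]
enriched_preds = [["action", "thriller", "drama", "crime", "war"],
                  ["romance", "drama", "comedy", "war", "crime"],
                  ["horror", "thriller", "comedy", "drama", "war"]]
gold           = [["action", "thriller"], ["romance", "drama"], ["horror", "thriller"]]

naive_scores    = [recall_at_n(p, g, 5) for p, g in zip(naive_preds, gold)]
enriched_scores = [recall_at_n(p, g, 5) for p, g in zip(enriched_preds, gold)]

# Aggregate recall@5 averaged over all movies, as reported in Table 2.
print(sum(enriched_scores) / len(enriched_scores))

# Dependent one-tailed t-test.
# H0: mean per-movie difference is 0; H1: enriched recall exceeds non-enriched.
t_stat, p_value = stats.ttest_rel(enriched_scores, naive_scores, alternative="greater")
print(t_stat, p_value)
</p></div>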
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Discussion</head><p>The results and analysis of Section 4.2 support the position that, when considered en masse, enrichment of prompts with domain-relevant information from Wikidata can improve cloze-style genre prediction in the movie domain. This is the case for both of the investigated verbalisation strategies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Results of separate dependent one-tailed t-tests under the alternative hypothesis that the average difference between the enriched and non-enriched prompts is non-zero in favour of the enriched case. A p-value less than 0.05 means that we reject the null hypothesis in favour of the alternative, with a 5% chance of Type I error. Note: * denotes that the p-value is 0 to at least 3 significant figures.</p><p>It is noteworthy, however, that the BERT and RoBERTa large models behave very differently in terms of both their non-enriched performance and their performance under varying levels of enrichment. This demonstrates that the potential for PLM improvement via prompt enrichment is highly specific to the model in question. BERT achieves its best aggregate recall for enriched prompts with relatively low levels of information enrichment, followed by a very rapid reduction in recall@n for further enriched prompts. RoBERTa large, in contrast, shows fluctuating performance relative to the non-enriched prompt, with the greatest performance for the more information-rich prompts.</p><p>It is beyond the scope of this paper to disentangle the influence on prediction outcomes of information variety from that of the specific information types themselves. However, there are preliminary indications of complex interactions. For example, as shown in Table <ref type="table" target="#tab_3">2</ref>, prompt 7 (verbalisation strategy A), which adds the release date information, shows a large spike in performance over the worst performing prompt 6 when applied to RoBERTa large. Analysis of a verbalisation strategy A prompt enriched by release date alone explains a large portion of the improvement (recall@1 = 0.167, recall@5 = 0.48). However, the overall context provided by prompt 7 yields the best performance overall: a one-tailed dependent t-test between prompt 7 and the case of enrichment by release date alone demonstrates significant non-zero differences in the direction of greater prompt 7 performance for each of recall@1 and recall@5, with both tests reporting a p-value close to 0 at the 0.05 significance level. Accordingly, the results suggest that further work is required to better understand the interactive effect of information enrichment for whatever model, domain, and task such enriched prompts may be applied to.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>Given that PLMs are limited in their performance on domain-specific cloze-style question answering prompts, in this paper we examined how adding context from KGs to naive prompts can improve the performance of PLMs on a movie genre prediction task. Through our experiments, we show a statistically significant improvement in recall for prompts enriched with information from the Wikidata KG, in comparison to non-enriched prompts, on the BERT and RoBERTa large PLMs.</p><p>As future work, we plan to expand our study to more domains, such as books and music, to better understand domain-specific optimal characteristics for enrichment and to cover the same domains as similar previous work <ref type="bibr" target="#b12">[13]</ref>. Additionally, we plan to enrich prompts using web entities <ref type="bibr" target="#b21">[22]</ref>. These entities are embedded in HTML pages using Microformats, Microdata and RDFa, and are extracted from the Common Crawl web corpus, the largest and most up-to-date web corpus available to the public. As more and more websites embed structured data describing, for instance, products, people, organizations, places, events, resumes, and cooking recipes, such engineered prompts could cover domain-specific knowledge that is not present in the encyclopedic Wikidata.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Proposed framework (a) typical querying setup for a Masked Language Model prediction. (b) Proposed approach to enrich the query using external knowledge.</figDesc><graphic coords="4,115.54,212.52,375.03,183.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 2</head><label>2</label><figDesc>Recall@n scores for BERT and RoBERTa large on the movie data subset, averaged over all movies. Verbalisation strategy A and B prompts consist of comma-separated and 'and'-separated Wikidata information, respectively, as described in Section 3.2. The greatest recall@n scores are highlighted in bold.</figDesc><table><row><cell>5</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Soft prompts are learnt at the embedding level.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>We would like to thank the International Semantic Web Summer School 2022, which initiated the collaboration between the authors of this paper. This work was funded in part by: the 'Culturally Aware AI' project funded by NWO; the ANR-19-CE23-0014 DeKaloG project (CE23, Intelligence artificielle); the CominLabs MiKroloG project; and Samsung Research UK. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 101004746.</p></div>
			</div>


			<div type="availability">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Wikidata property | Property Label | Enrichment Text
wdt:P161 | cast member | starring
wdt:P57 | director | directed by
wdt:P162 | producer | produced by
wdt:P58 | screenwriter | screenwriter
wdt:P86 | composer | music by
wdt:P1040 | film editor | edited by
wdt:P577 | year | released
wdt:P750 | distributed by | distributed by
wdt:P495 | country of origin | originating from</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno>ArXiv abs/1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ryder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Subbiah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhariwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neelakantan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shyam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sastry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Herbert-Voss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Krueger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Henighan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ziegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Winter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hesse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sigler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Litwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Berner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mccandlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf" />
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Larochelle</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Ranzato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Hadsell</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Balcan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Lin</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="1877" to="1901" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Observing lod using equivalent set graphs: it is mostly flat and sparsely linked</title>
		<author>
			<persName><forename type="first">L</forename><surname>Asprino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Beek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ciancarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">V</forename><surname>Harmelen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Presutti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="57" to="74" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Adaptive prompt learning-based few-shot sentiment analysis</title>
		<author>
			<persName><forename type="first">P</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Chai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<idno>ArXiv abs/2205.07220</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Building a personalized dialogue system with prompt-tuning</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kasahara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kawahara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Shinzato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Sato</surname></persName>
		</author>
		<idno>ArXiv abs/2206.05399</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Enhancing cross-lingual natural language inference by promptlearning from cross-lingual templates</title>
		<author>
			<persName><forename type="first">K</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 60th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1910" to="1923" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Knowledgeable prompt-tuning: Incorporating knowledge into prompt verbalizer for text classification</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 60th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="2225" to="2240" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><surname>Heinzerling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Inui</surname></persName>
		</author>
		<idno>ArXiv abs/2008.09036</idno>
		<title level="m">Language models as knowledge bases: On entity representations, storage capacity, and paraphrased queries</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Transfer learning in natural language processing</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ruder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Swayamdipta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: Tutorials</title>
				<meeting>the 2019 conference of the North American chapter of the association for computational linguistics: Tutorials</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="15" to="18" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hayashi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2107.13586</idno>
		<title level="m">Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Mickus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Paperno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Constant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Van Deemter</surname></persName>
		</author>
		<idno>ArXiv abs/1911.05758</idno>
		<title level="m">What do you mean, bert? assessing bert as a distributional semantics model</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">How can we know what language models know?</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">F</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Araki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.1911.12543</idno>
		<ptr target="https://arxiv.org/abs/1911.12543.doi:10.48550/ARXIV.1911.12543" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">What does bert know about books, movies and music? probing bert for conversational recommendation</title>
		<author>
			<persName><forename type="first">G</forename><surname>Penha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hauff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Fourteenth ACM Conference on Recommender Systems</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Eliciting knowledge from language models using automatically generated prompts</title>
		<author>
			<persName><forename type="first">T</forename><surname>Shin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Razeghi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">L L</forename><surname>Iv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Wallace</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Singh</surname></persName>
		</author>
		<idno>ArXiv abs/2010.15980</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Noisy channel language model prompting for few-shot text classification</title>
		<author>
			<persName><forename type="first">S</forename><surname>Min</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajishirzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 60th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="5316" to="5330" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Exploiting cloze-questions for few-shot text classification and natural language inference</title>
		<author>
			<persName><forename type="first">T</forename><surname>Schick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schütze</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</title>
				<meeting>the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="255" to="269" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bartolo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Riedel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stenetorp</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.08786</idno>
		<title level="m">Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>West</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bhagavatula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hessel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Hwang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">L</forename><surname>Bras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Welleck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Choi</surname></persName>
		</author>
		<idno>ArXiv abs/2110.07178</idno>
		<title level="m">Symbolic knowledge distillation: from general language models to commonsense models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Atomic: An atlas of machine commonsense for if-then reasoning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Sap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">Le</forename><surname>Bras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Allaway</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bhagavatula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lourie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Rashkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Roof</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Choi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI conference on artificial intelligence</title>
				<meeting>the AAAI conference on artificial intelligence</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="3027" to="3035" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Roberta: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno>CoRR abs/1907.11692</idno>
		<ptr target="http://arxiv.org/abs/1907.11692.arXiv:1907.11692" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">The movielens datasets: History and context, Acm transactions on interactive intelligent systems</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Harper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Konstan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">tiis)</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="1" to="19" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Mühleisen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bizer</surname></persName>
		</author>
		<title level="m">Web data commons-extracting structured data from two large web corpora</title>
				<imprint>
			<publisher>LDOW</publisher>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
