<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Dynamic Prompting: Large Language Models for Task Oriented Dialog</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jan</forename><surname>Nehring</surname></persName>
							<email>jan.nehring@dfki.de</email>
							<affiliation key="aff0">
								<orgName type="department">German Research Center for Artificial Intelligence (DFKI)</orgName>
								<address>
									<addrLine>Alt-Moabit 91c</addrLine>
									<postCode>10559</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Akhil</forename><surname>Juneja</surname></persName>
							<email>juneja.akhil@dfki.de</email>
							<affiliation key="aff0">
								<orgName type="department">German Research Center for Artificial Intelligence (DFKI)</orgName>
								<address>
									<addrLine>Alt-Moabit 91c</addrLine>
									<postCode>10559</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Adnan</forename><surname>Ahmad</surname></persName>
							<email>adnan.ahmad@tu-berlin.de</email>
							<affiliation key="aff1">
								<orgName type="institution">TU Berlin</orgName>
								<address>
									<addrLine>Straße des 17. Juni 135</addrLine>
									<postCode>10623</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Roland</forename><surname>Roller</surname></persName>
							<email>roland.roller@dfki.de</email>
							<affiliation key="aff0">
								<orgName type="department">German Research Center for Artificial Intelligence (DFKI)</orgName>
								<address>
									<addrLine>Alt-Moabit 91c</addrLine>
									<postCode>10559</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dietrich</forename><surname>Klakow</surname></persName>
							<email>dietrich.klakow@lsv.uni-saarland.de</email>
							<affiliation key="aff2">
								<orgName type="institution">Saarland University</orgName>
								<address>
									<addrLine>Campus</addrLine>
									<postCode>66123</postCode>
									<settlement>Saarbrücken</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Dynamic Prompting: Large Language Models for Task Oriented Dialog</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
						<meeting>
							<date>Dec 04-06, 2024</date>
							<address>
								<settlement>Pisa</settlement>
								<country key="IT">Italy</country>
							</address>
						</meeting>
					</monogr>
					<idno type="MD5">6BF92B30001C88A41F823CB6FFC9C808</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Dialog Systems</term>
					<term>Large Language Models</term>
					<term>Task-Oriented Dialog</term>
					<term>Dynamic Prompting</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Large Language Models show impressive results in many different applications, most notably in question answering and open dialog. However, it remains an open question how to use these models for task-oriented dialogs, for example in booking or customer information systems. In this work, we propose Dynamic Prompting, an architecture for task-oriented dialog that integrates the benefits of Large Language Models, and showcase the approach on the MultiWOZ 2.2 dataset. Our architecture leads to a high task success rate, provides sensible and specific answers, and is resistant to hallucinations. Further, we show that Dynamic Prompting is able to answer questions that were not anticipated by the dialog system's designer and that it can correct several types of errors and other undesirable characteristics of the system.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Task-Oriented Dialog Systems (TODS) assist users in completing a task within a conversation <ref type="bibr" target="#b0">[1]</ref>, for instance, in the context of customer information and bookings (train/restaurant). In an applied setting with real users, it is important that those systems provide correct answers, solve tasks quickly, and ideally lead to high user satisfaction. To ensure this, TODS often give system developers a high level of control over dialog management and answer behavior. Existing solutions normally either implement a dialog manager manually to control the complete interaction, or train it on large amounts of dialog interactions <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>.</p><p>In contrast, Large Language Models (LLMs) are very good at open-domain dialog and provide fluent and convincing messages in different styles. However, those answers might be misleading or even false (hallucination) <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>. In task-oriented dialog, the model could also 'break out' of the given dialog task.</p><p>Using LLMs for task-oriented dialog is still in its infancy. Madotto et al. <ref type="bibr" target="#b8">[9]</ref> used LLMs for the whole pipeline of Natural Language Understanding (NLU), Dialog State Tracking (DST), Dialog Policy and Natural Language Generation. Hudeček and Dušek <ref type="bibr" target="#b9">[10]</ref> expand on this idea by evaluating the abilities of LLMs to generate complete task-oriented multi-turn dialogs. They also used LLMs for NLU and DST but, unlike our work, they used a static prompt. Other approaches to LLMs for task-oriented dialog are presented by Cao <ref type="bibr" target="#b10">[11]</ref>, Hu et al. <ref type="bibr" target="#b11">[12]</ref>, Wei et al. <ref type="bibr" target="#b12">[13]</ref>, and Li et al. <ref type="bibr" target="#b13">[14]</ref>.</p><p>To address those limitations and concerns, we propose Dynamic Prompting, a technique that combines a traditional task-oriented dialog system pipeline with the benefits of LLMs. Showcasing and testing our approach in the context of restaurant booking, we present its advantages and limitations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Experimental Setup</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Dataset</head><p>For our experiments, we use the Multi-domain Wizard-of-Oz 2.2 (MultiWOZ) dataset <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref>. The dataset contains conversations between users and task-oriented chatbots that were collected using the Wizard-of-Oz <ref type="bibr" target="#b16">[17]</ref> method. For our setup, we only use conversations about the restaurant domain. In the conversations, users ask chatbots about restaurants in Cambridge, England. The systems can recommend various restaurants based on attributes such as the type of food, location or price range. Also, the system can perform bookings. Table <ref type="table" target="#tab_0">1</ref> shows an example dialog from the dataset. The restaurant domain encompasses two intents (find, book) and ten slots. For each user utterance, the dataset contains annotations for intent, slots in the utterance and dialog state (detected slots over the whole dialog). </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Dialog System Architecture</head><p>Task-Oriented Dialog Systems (TODS) consist, traditionally, of the following main components <ref type="bibr" target="#b0">[1]</ref>: Natural Language Understanding (NLU), Dialog State Tracking (DST), Dialog Manager (DM) and Response Generation (RG). The NLU performs intent detection and slot filling on the current user utterance, whereas DST keeps track of the slots over the whole dialog. The DM selects the next action of the bot based on the results of NLU and DST. Lastly, RG constructs the response to the user.</p></div>
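<div xmlns="http://www.tei-c.org/ns/1.0"><p>The division of labor between NLU and DST can be illustrated with a minimal sketch. It assumes, for illustration only, that the NLU returns a flat slot-to-value mapping for each utterance and that the tracker keeps the most recent value per slot; the function name update_dialog_state is hypothetical and not part of any library.</p><p>
```python
from typing import Dict


def update_dialog_state(state: Dict[str, str], nlu_slots: Dict[str, str]) -> Dict[str, str]:
    """Return a new dialog state in which the latest NLU slot values win."""
    new_state = dict(state)
    for slot, value in nlu_slots.items():
        if value:  # skip empty detections so that earlier values are kept
            new_state[slot] = value
    return new_state


# Two user turns: the second one overrides the food slot only.
state: Dict[str, str] = {}
state = update_dialog_state(state, {"food": "panasian", "pricerange": "expensive"})
state = update_dialog_state(state, {"food": "mediterranean"})
print(state)
```
</p><p>In this sketch the price range from the first turn survives the second turn, which is the behavior one would expect from a tracker that accumulates slots over the whole dialog.</p></div>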
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dynamic Prompting</head><p>In the following, we introduce Dynamic Prompting, a TODS architecture extended by the capabilities of an LLM. Figure <ref type="figure" target="#fig_0">1</ref> shows the architecture. We chose to use a trained model for the NLU component to handle intent recognition and entity extraction, as Hudeček and Dušek <ref type="bibr" target="#b9">[10]</ref> highlighted the limited performance of LLMs in these tasks. For NLU, we use the RASA NLU component, powered by the DIET classifier <ref type="bibr" target="#b17">[18]</ref>, while for DST, we use a simple hashmap that stores the most recent NLU results. We trained the NLU component only on user utterances from the restaurant domain of the train split of the dataset, focusing on the find and book intents. During training, we also provided use case-specific entities, including categorical ones (pricerange, area, bookday, bookpeople) as well as non-categorical ones (food, name, booktime, address, phone, postcode, and reference). We replace/extend the Dialog Manager and Response Generation with a Prompt Generation component and an LLM. The prompt generation takes input from the DST and NLU and applies a series of rules, similar to a dialog manager. It also fetches data from a database and generates a prompt. It uses prompting templates that consist of three parts: 1) a general task description, 2) content from the database and the NLU state in JSON format, and 3) the previous conversation in a theater script style. Table <ref type="table">2</ref> shows an example prompt of our system in the restaurant domain with the detected intent 'book restaurant'. The prompt is then sent to an LLM to generate the corresponding reply to the user. In our experiment, we use GPT-3.5-Turbo (ChatGPT), accessed via API calls, as it demonstrated leading performance in the results presented by Hudeček and Dušek <ref type="bibr" target="#b9">[10]</ref>.</p><p>Appendix A shows the prompting templates and the rules that we developed for our prompt generator. Our prompt construction approach involved multiple iterations of trial and error on the training set, evaluating the prompts' effectiveness based on the system's task completion and relevance to the conversation. Initially, we used a single instruction in the prompt. However, scenarios such as having no available restaurants, multiple options, or booking a restaurant required more specific instructions. This led us to implement dynamic prompts with tailored rules for each situation.</p><p>Assist the user in booking a restaurant. Always assume the restaurant is available to confirm a successful booking. Provide a reference number when the restaurant name, bookday, bookpeople, and booktime are given. Prompt if these details are missing. Omit information about fictional bookings.</p><p>Dialog State: {"food": "mediterranean", "pricerange": "expensive", "bookday": "wednesday", "bookpeople": "7", "booktime": "15:15"}</p><p>Conversation History</p><p>User: Can you find me an expensive place serves panasian food?</p><p>Bot: i haven't found any expensive panasian restaurants. should i try another price range?</p><p>User: Are there restaurants that serve Mediterranean food?</p><p>Bot: I have a couple. Both are in the centre. la mimosa, shiraz restaurant</p><p>User: I am interested in La Mimosa. Please book a table for 7 at 15:15 on Wednesday. Thanks for your help!</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Example prompt of Dynamic Prompting, which is sent to ChatGPT. </p></div>
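<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three-part prompt structure and the rule-based choice of the task description can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names TASKS, select_task and build_prompt are hypothetical, the task descriptions are abbreviated, the "Count" and "Options" keys only mirror the example prompts, and the boundary between "multiple" and "numerous" results (two to four versus five or more) is an assumption.</p><p>
```python
import json
from typing import Dict, List, Tuple

# Abbreviated task descriptions (hypothetical keys; wording shortened for the sketch).
TASKS = {
    "numerous": "Assist the user in finding a restaurant. Numerous restaurants match the user's criteria.",
    "multiple": "Assist the user in finding a restaurant. Multiple restaurants match the user's criteria.",
    "single": "Assist the user in finding a restaurant. A single restaurant matches the user's criteria.",
    "empty": "Assist the user in finding a restaurant. No restaurant matches the user's criteria.",
    "book": "Assist the user in booking a restaurant.",
    "none": "You're assisting the user in finding and booking a restaurant.",
}


def select_task(intent: str, n_results: int) -> str:
    """Pick a task description from the intent and the number of database hits."""
    if intent == "book_restaurant":
        return TASKS["book"]
    if intent != "find_restaurant":
        return TASKS["none"]
    if n_results >= 5:
        return TASKS["numerous"]
    if n_results >= 2:  # assumption: "multiple" covers two to four hits
        return TASKS["multiple"]
    return TASKS["single"] if n_results == 1 else TASKS["empty"]


def build_prompt(intent: str, state: Dict[str, str], database: List[dict],
                 history: List[Tuple[str, str]]) -> str:
    """Join task description, JSON context and script-style history into one prompt."""
    parts = [select_task(intent, len(database))]
    if intent in ("find_restaurant", "book_restaurant"):
        parts.append("State: " + json.dumps(state))
    if intent == "find_restaurant":
        parts.append("Database: " + json.dumps({"Count": len(database), "Options": database}))
    parts.extend(f"{speaker}: {text}" for speaker, text in history)
    return "\n".join(parts)


prompt = build_prompt(
    "book_restaurant",
    {"name": "la mimosa", "bookday": "wednesday", "bookpeople": "7", "booktime": "15:15"},
    [],
    [("User", "I am interested in La Mimosa. Please book a table for 7 at 15:15 on Wednesday.")],
)
print(prompt)
```
</p><p>Keeping the rule selection separate from the string assembly means that a new situation only requires a new template and one additional branch, which is the kind of developer control that a hand-written dialog manager offers.</p></div>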
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Evaluation</head><p>In our experiment, the chatbot generates a response using our dynamic prompting system for each dialog turn. We evaluated its performance on the test split of the dataset's restaurant domain. To evaluate the responses on different levels, two human annotators labeled them according to the following criteria. The annotation guidelines can be found in the supplementary materials.</p><p>• Task Success Rate describes the ratio of successful dialogs to the total number of dialogs. Following the definition of Wen et al. <ref type="bibr" target="#b18">[19]</ref> and Nekvinda and Dušek <ref type="bibr" target="#b19">[20]</ref>, we mark a dialog as successful if 1) the system's recommendation aligns with the user's requests (such as price range, location, and cuisine) and 2) the system adequately addressed subsequent requests by the user, such as providing the telephone number or confirming a successful booking.</p><p>• Prompt Instruction Performance, a binary scale to assess whether responses aligned with the prompt instructions.</p><p>• Information Extraction Performance, a binary scale to assess whether the system is able to fetch the relevant address from the JSON information.</p><p>• Response Slot Accuracy, the ratio of correctly predicted slot values to the number of slot values in the response. It measures whether our system is able to return all desired slots to the user. For these metrics, we compute ratios across all annotated turns.</p><p>• Sensibleness describes whether the utterance makes sense given the context <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22]</ref>.</p><p>• Specificity describes whether the utterance is specific regarding the context <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22]</ref>. LMs tend to generate unspecific answers such as "this is great", which are sensible but not desired.</p><p>• Interestingness describes whether the utterance captures someone's attention, arouses curiosity or exhibits traits such as unexpectedness, wit, or insightfulness <ref type="bibr" target="#b21">[22]</ref>. Interestingness contributes to a compelling and engaging user experience.</p></div>
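<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two ratio-based metrics can be made concrete with a short sketch. It assumes that the human annotations are already available as a success flag per dialog and as a pair of counts (correct slot values, slot values in total) per turn; the function names are hypothetical, and pooling the counts over all turns is an assumption, since the text does not state whether turns are pooled or averaged.</p><p>
```python
from typing import List, Tuple


def task_success_rate(dialog_success: List[bool]) -> float:
    """Ratio of successful dialogs to the total number of dialogs."""
    return sum(dialog_success) / len(dialog_success)


def response_slot_accuracy(turns: List[Tuple[int, int]]) -> float:
    """Ratio of correct slot values to all slot values, pooled over turns.

    Each turn is a pair (correct_slot_values, total_slot_values).
    """
    correct = sum(c for c, _ in turns)
    total = sum(t for _, t in turns)
    return correct / total


# Toy annotations: three of four dialogs succeed; three of five slot values are correct.
print(task_success_rate([True, True, False, True]))
print(response_slot_accuracy([(2, 2), (1, 3)]))
```
</p></div>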
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head><p>Table <ref type="table">3</ref> shows the task success rate of our system compared to other TODS on the MultiWOZ 2.2 dataset. Although the other systems use the whole dataset and are thus not perfectly comparable to ours, the comparison still shows that Dynamic Prompting performs similarly to state-of-the-art (SOTA) systems. This is remarkable, particularly as we use a relatively simple NLU component, which by itself might produce errors. However, if we do not use the NLU system of our pipeline but instead use the entity annotations from the dataset, we get a 'perfect' NLU without any errors. In this case, Dynamic Prompting achieves a task success rate of 0.94, which highlights the effectiveness of the LLM solution.</p><p>Table <ref type="table">5</ref> shows further performance metrics. The dialog success rate is supported by the high sensibleness and specificity scores, which indicate that the system answers on point and does not deviate from the dialog's goal. However, the response slot accuracy is only 80% and needs to be improved, although this is not the focus of this work.</p><p>Extracting information from the database works almost perfectly (Information Extraction Performance=0.98). Although the system does not always follow all instructions from the prompt (Prompt Instruction Performance=0.82), the task success is still quite high, so we assume that only minor errors cause the relatively low Prompt Instruction Performance.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 3</head><label>3</label><figDesc>Comparison of Task Success Rates on MultiWOZ 2.2 data, with an inter-annotator agreement of 1 for Dynamic Prompting.</figDesc><table><row><cell>System</cell><cell>Task Success</cell></row><row><cell>Yang et al. <ref type="bibr" target="#b22">[23]</ref></cell><cell>0.83</cell></row><row><cell>Lee <ref type="bibr" target="#b2">[3]</ref></cell><cell>0.80</cell></row><row><cell>Su et al. <ref type="bibr" target="#b23">[24]</ref></cell><cell>0.85</cell></row><row><cell>Dynamic Prompting</cell><cell>0.81</cell></row><row><cell>perfect NLU + Dynamic Prompting</cell><cell>0.94</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Qualitative Analysis</head><p>In the following, we analyze the conversations and, particularly, the generated responses of our Dynamic Prompting in more detail.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">Handling Unusual Requests</head><p>In one situation, the user asked the system to send the information via email, a request that the designers of the original dataset did not anticipate. In such situations, traditional dialog systems can only answer with "I did not understand". Our approach, in contrast, was able to produce a sensible response, although it was never trained for this case (see Table <ref type="table">4</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Politeness and Engagement</head><p>Similar to our findings in Section 3.1.4, the responses of our system are not only longer but also more engaging than the ground truth. For example, in one situation, our system answered "You're welcome! If you have any more questions or need further assistance, feel free to ask. Have a great day too!" while the crowd worker wrote only "Thank you. Goodbye". Overall, we counted 'polite' phrases in the responses and found that Dynamic Prompting uses them more often than the ground truth: "enjoy your meal" (15.5 times more often), "have a great day" (2.2 times), "you're welcome" (4.8 times), "certainly!" (61.0 times), and "great!" (20.0 times). Table <ref type="table">9</ref> in the appendix shows more detailed examples.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.3.">Formatting Addresses and Names</head><p>The database entries do not always follow standard written conventions. Names are often lowercase, and the crowd workers did not correct this issue when they wrote the system responses. Also, postcodes are stored in the format "cb17aa" in the database, although the correct format in the Cambridge area would be "CB1 7AA". Our approach consistently fixes these errors out of the box.</p></div>
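<div xmlns="http://www.tei-c.org/ns/1.0"><p>For comparison, the target postcode format can be written down as a deterministic rule. The sketch below is not part of the described system, in which the LLM performs this correction without explicit rules; it only illustrates the expected output and relies on the fact that the inward part of a UK postcode always consists of the last three characters. The function name format_postcode is hypothetical.</p><p>
```python
def format_postcode(raw: str) -> str:
    """Convert a compact lowercase UK postcode into its written form."""
    compact = raw.replace(" ", "").upper()
    # The inward code is always the final three characters.
    return compact[:-3] + " " + compact[-3:]


print(format_postcode("cb17aa"))
print(format_postcode("cb23nj"))
```
</p></div>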
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.4.">Diverse Responses</head><p>Dynamic Prompting produces responses that are, on average, 2.41 times longer and more diverse than the responses of the crowd workers in the WOZ dataset, with lexical diversity measured by an MTLD score <ref type="bibr" target="#b24">[25]</ref> of 80.41 compared to 72.26 for the WOZ dataset. We assume that the crowd workers were interested in providing fast and minimalistic answers. While diverse replies might be considered positive because they make the interaction more enjoyable, it remains to be determined whether long answers are preferable to short answers that contain all necessary information. Table <ref type="table">7</ref> in the appendix shows examples of this behavior.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.5.">Adaptive NLU Error Recovery</head><p>We detected a user dialog in which the NLU failed to detect the correct area. The user asked for a restaurant in the center of the city, but due to the NLU error, the prompt generator included two restaurants in the prompt, one in the center and one in the south. The LLM was able to detect that the user wanted a restaurant in the center and presented only the correct restaurant in its response (see Table <ref type="table" target="#tab_4">8</ref> in the Appendix).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.6.">Error Analysis</head><p>Our system produced two types of problems. First, it was not able to display the total count of available restaurants correctly. Although the total count of available restaurants is encoded in the JSON prompt, the LLM opted to count the number of elements within the JSON options array instead.</p><p>The second common problem was the generation of fake booking reference numbers. At the end of the dialog, users often ask for a booking reference number. We do not include booking reference numbers in the prompt. Therefore, the system hallucinated booking reference numbers. On ten samples, we evaluated including reference numbers in JSON format in the database section of the prompt, which eliminated all occurrences of fake reference number generation in this small sample set. However, we did not apply this to the whole dataset, so our results only indicate that this approach could solve the problem of hallucinated booking reference numbers.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 4</head><label>4</label><figDesc>Showcasing the ability of the system to handle unusual requests.</figDesc><table><row><cell>Prompt:</cell><cell>Assist the user in finding a restaurant. A single restaurant matches the user's criteria. Provide requested details of the restaurant. If no specific request is made, offer general information like food type, area, and price range. Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information.</cell></row><row><cell>State:</cell><cell>restaurant {"area": "centre", "food": "italian", "name": "stazione restaurant and coffee bar", "bookday": "saturday", "bookpeople": "1", "booktime": "12:45"}</cell></row><row><cell>Database:</cell><cell>restaurant {Count:1, Options: {"address": "Market Hill City Centre", "area": "centre", "food": "italian", "id": "19240", "name": "stazione restaurant and coffee bar", "phone": "01223352607", "postcode": "cb23nj", "pricerange": "expensive", "type": "restaurant"}}</cell></row><row><cell>User:</cell><cell>Terrific! You are great, please send me all this information to my e-mail, thanks.</cell></row><row><cell>TOD:</cell><cell>I apologize, but as a language model AI, I don't have the capability to send emails. However, you can easily save or take a screenshot of this conversation for future reference. Remember to have a great time at stazione restaurant and coffee bar on Saturday at 12:45!</cell></row><row><cell>Truth:</cell><cell>You're welcome, was there anything else you need?</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 5</head><label>5</label><figDesc>Scores and inter-annotator agreement (IA, Cohen's kappa) of the quantitative analysis.</figDesc><table><row><cell>Metric</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>We In addition, our system shows various ways to correct errors, such as NLU errors, user requests not anticipated by the designer of DS, and errors in the format of the database entries. Moreover, the generated system answers are more diverse (Section 3.1.4) and more polite (Section 3.1.2) than the human-generated responses in the dataset. We would like to examine these qualitative results in future research in a more quantitative way.</p><p>Overall, we find that the widespread problem of hallucinations in LLMs is not an issue in our system as long as we present the correct information to the LLM. As soon as the user asks the system for information that is not present in the prompt, such as the booking reference numbers, the LLM starts to hallucinate.</p><p>Although we assess the system's performance solely on the restaurant domain, the dynamic prompting method can be extended to other domains in the Multi-WOZ 2.2 dataset, such as hotel, taxi, and train. Expanding to new domains will require updating the prompt generation module to accommodate new intents and state values, ensuring smooth integration with these additional domains. I am looking at a place to eat that serves venetian food in the centre.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>intent</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>TOD:</head><p>I apologize, but it seems that there are no restaurants in our database that serve Venetian food in the centre at the moment. However, I can suggest a few alternatives for you. You might consider trying some other types of cuisine such as Italian, seafood, or Mediterranean, which are commonly found in the centre. Alternatively, you could try searching for Venetian restaurants in nearby areas or expanding your search to include restaurants that are not strictly in the centre.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Truth:</head><p>Unfortunately, I don't see any restaurants that serve Venetian in the centre of town. Would you like to try a different area or type of cuisine?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 7</head><p>Showcasing the ability of the system to generate diverse responses for a similar scenario.</p><p>Prompt: Assist the user in finding a restaurant. Numerous restaurants match the user's criteria. Ask for missing details of food type, area, or price range to narrow down the search before providing the restaurant options. Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information. State: restaurant {"food": "modern european", "pricerange": "moderate"} Database: restaurant {Count:5, Options: {"address": "152 -154 Hills Road", "area": "south", "food": "modern european", "id": "14731", "name": "restaurant alimentum", "phone": "01223413000", "postcode": "cb28pb", "pricerange": "moderate", "signature": "slowroast sirloin of beef red onion celeriac and garlic", "type": "restaurant"}, {"address": "83 Regent Street", "area": "centre", "food": "modern european", "id": "3697", "name": "de luca cucina and bar", "phone": "01223356666", "postcode": "cb21aw", "pricerange": "moderate", "signature": "roasted barbary duck breast served with sweet potato wedges and mange tout with a red wine sauce", "type": "restaurant"} User: Doesn't really matter. Is there one in the centre? TOD:</p><p>Yes, there is a modern European restaurant called "De Luca Cucina and Bar" located in the centre at 83 Regent Street. They offer a variety of dishes including their signature dish of roasted Barbary duck breast served with sweet potato wedges and mange tout with a red wine sauce. The price range is moderate. Would you like more information about this restaurant? Truth:</p><p>Yes, there are 2: De Luca Cucina and Bar &amp; Riverside Brasserie. Would you like further information? Thank you. Goodbye.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 9</head><p>Showcasing the ability of system to generate polite and engaging responses.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Processing pipeline of Dynamic Prompting</figDesc><graphic coords="3,89.29,84.19,426.00,186.30" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>MultiWOZ 2.2 example dialog.</figDesc><table><row><cell cols="2">User: Can you find me an expensive place serves</cell></row><row><cell></cell><cell>panasian food?</cell></row><row><cell>Bot:</cell><cell>i haven't found any expensive panasain</cell></row><row><cell></cell><cell>restaurants. should i try another price</cell></row><row><cell></cell><cell>range?</cell></row><row><cell cols="2">User: Are there restaurants that serve Mediter-</cell></row><row><cell></cell><cell>ranean food?</cell></row><row><cell>Bot:</cell><cell>I have a couple. Both are in the centre. la</cell></row><row><cell></cell><cell>mimosa, shiraz restaurant</cell></row><row><cell cols="2">User: I am interested in La Mimosa. Please book</cell></row><row><cell></cell><cell>a table for 7 at 15:15 on Wednesday. Thanks</cell></row><row><cell></cell><cell>for your help!</cell></row><row><cell>Bot:</cell><cell>Booking was successful. The table will be</cell></row><row><cell></cell><cell>reserved for 15 minutes. Reference number</cell></row><row><cell></cell><cell>is : ER5DFE50 . Is there anything else I can</cell></row><row><cell></cell><cell>help you with?</cell></row><row><cell cols="2">User: No, that's it. Thank you so much. Good-</cell></row><row><cell></cell><cell>bye.</cell></row><row><cell>Bot:</cell><cell>Have a great day!</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 6</head><label>6</label><figDesc>Assist the user in finding a restaurant. Numerous restaurants match the user's criteria.Ask  for missing details of food type, area, or price range to narrow down the search before providing the restaurant options. Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information. {State + Database + Conversation History} len(database) ≥ 1: Assist the user in finding a restaurant. Multiple restaurants match the user's criteria. Provide requested details from the available restaurant options. If no specific request is made, offer general information like food type, area, and price range. Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information. Assist the user in finding a restaurant. A single restaurant matches the user's criteria. Provide requested details of the restaurant. If no specific request is made, offer general information like food type, area, and price range. Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information. Assist the user in finding a restaurant. No restaurant matches the user's criteria. Suggest trying different food types, areas, or prices. Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information. Prompting templates that we developed for our system. Assist the user in finding a restaurant. No restaurant matches the user's criteria. Suggest trying different food types, areas, or prices. 
Respond in a coherent, non-numbered, and fluent text format based only on the conversation history, state, and database information.</figDesc><table><row><cell>== find_restaurant:</cell></row><row><cell>len(database) ≥ 5:</cell></row><row><cell>{State + Database + Conversation History}</cell></row><row><cell>len(database) == 1:</cell></row><row><cell>{State + Database + Conversation History}</cell></row><row><cell>len(database) == 0:</cell></row><row><cell>{State + Database + Conversation History}</cell></row><row><cell>intent == book_restaurant:</cell></row><row><cell>Assist the user in booking a restaurant. Always assume the restaurant is available to confirm a</cell></row><row><cell>successful booking. Provide a reference number when the restaurant name, bookday, bookpeople, and</cell></row><row><cell>booktime are given. Prompt if these details are missing. Omit information about fictional bookings.</cell></row><row><cell>{State + Conversation history}</cell></row><row><cell>intent == none:</cell></row><row><cell>You're assisting the user in finding and booking a restaurant. Provide helpful information and</cell></row><row><cell>guidance based on their requests.</cell></row><row><cell>{Conversation history}</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 8</head><label>8</label><figDesc>Showcasing the ability of the system to adapt to NLU errors.</figDesc><table><row><cell>Example 1:</cell><cell></cell></row><row><cell>User:</cell><cell>Are there restaurants that serve Mediterranean food?</cell></row><row><cell>TOD:</cell><cell>Yes, there are restaurants that serve Mediterranean food. The options available are "La Mimosa" and "Shiraz Restaurant." Both of them are located in the center area of town. "La Mimosa" is situated on Thompsons Lane in Fen Ditton, and you can reach them at 01223362525. On the other hand, "Shiraz Restaurant" is located at 84 Regent Street in the City Centre, with a contact number of 01223307581. Both restaurants offer Mediterranean cuisine and fall under the expensive price range. Let me know if you need any additional information!</cell></row><row><cell>Truth:</cell><cell>I have a couple. Both are in the centre. la mimosa, shiraz restaurant.</cell></row><row><cell>Example 2:</cell><cell></cell></row><row><cell>User:</cell><cell>Hi, I'm looking for a restaurant with modern european food. It should be in the moderate price range.</cell></row><row><cell>TOD:</cell><cell>Sure, I can help you with that. Can you please provide me with any specific area you prefer or is it fine if I search for restaurants in any area?</cell></row><row><cell>Truth:</cell><cell>What area are you wanting to be in?</cell></row><row><cell>Example 3:</cell><cell></cell></row><row><cell>User:</cell><cell>Thank you so much. That is all I need for now. Have a great day.</cell></row><row><cell>TOD:</cell><cell>You're welcome! If you have any more questions or need further assistance, feel free to ask. Have a great day too!</cell></row><row><cell>Truth:</cell><cell></cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>This work has been supported by the Federal Joint Committee of Germany (Gemeinsamer Bundesausschuss) as part of the project smartNTX (01NVF21116).</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Prompting templates</head><p>Table <ref type="table">6</ref> shows some prompting templates that have been developed for our system.</p></div>
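<div xmlns="http://www.tei-c.org/ns/1.0"><p>The conditions listed in Table 6 can be read as a small dispatch over the detected intent and the number of database matches. The following is a minimal sketch of that reading, not our actual implementation: the names select_template, build_prompt, and INSTRUCTIONS are hypothetical, the instruction texts are shortened to their first sentences, and the order of the checks (exact counts first, so that the "multiple" case covers two to four matches) is an assumption.</p>
<p>
```python
from typing import Sequence

# Shortened instruction texts (first sentences of the templates in Table 6).
INSTRUCTIONS = {
    "find_numerous": "Assist the user in finding a restaurant. Numerous restaurants match the user's criteria.",
    "find_multiple": "Assist the user in finding a restaurant. Multiple restaurants match the user's criteria.",
    "find_single": "Assist the user in finding a restaurant. A single restaurant matches the user's criteria.",
    "find_none": "Assist the user in finding a restaurant. No restaurant matches the user's criteria.",
    "book": "Assist the user in booking a restaurant.",
    "none": "You're assisting the user in finding and booking a restaurant.",
}


def select_template(intent: str, database: Sequence[dict]) -> str:
    """Return the key of the instruction to use for the current turn."""
    if intent == "find_restaurant":
        n = len(database)
        if n == 0:
            return "find_none"
        if n == 1:
            return "find_single"
        if n >= 5:
            return "find_numerous"
        return "find_multiple"  # assumption: covers two to four matches
    if intent == "book_restaurant":
        return "book"
    return "none"  # fallback when no intent was detected


def build_prompt(intent: str, state: dict, database: Sequence[dict], history: Sequence[str]) -> str:
    """Join the instruction with the context fields that Table 6 lists for it."""
    key = select_template(intent, database)
    parts = [INSTRUCTIONS[key]]
    if key != "none":
        parts.append(f"State: {state}")  # state is used for find and book
    if key.startswith("find_"):
        parts.append(f"Database: {list(database)}")  # database only for find
    parts.append("Conversation history:\n" + "\n".join(history))
    return "\n".join(parts)
```
</p><p>For example, build_prompt("book_restaurant", {"bookday": "friday"}, [], ["User: book it"]) would yield the booking instruction followed by the state and the conversation history, without a database section.</p></div>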
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Example Conversations</head><p>Tables <ref type="table">7-9</ref> show different example conversations.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Martin</surname></persName>
		</author>
		<ptr target="https://web.stanford.edu/~jurafsky/slp3/ed3bookfeb3_2024.pdf" />
		<title level="m">Speech and Language Processing (Third Edition draft)</title>
				<imprint>
			<date type="published" when="2024-03-10">2024. 2024-3-10</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Galaxy: A generative pre-trained model for task-oriented dialog with semi-supervised learning and explicit policy injection</title>
		<author>
			<persName><forename type="first">W</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Si</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Improving end-to-end task-oriented dialog system with a simple auxiliary task</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lee</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.findings-emnlp.112</idno>
		<ptr target="https://aclanthology.org/2021.findings-emnlp.112" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: EMNLP 2021, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">M.-F</forename><surname>Moens</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Specia</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><forename type="middle">W.-T</forename><surname>Yih</surname></persName>
		</editor>
		<meeting><address><addrLine>Punta Cana, Dominican Republic</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1296" to="1303" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Mars: Modeling context &amp; state representations with contrastive learning for end-to-end task-oriented dialog</title>
		<author>
			<persName><forename type="first">H</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.findings-acl.708</idno>
		<ptr target="https://aclanthology.org/2023.findings-acl.708" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: ACL 2023, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rogers</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Boyd-Graber</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Okazaki</surname></persName>
		</editor>
		<meeting><address><addrLine>Toronto, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="11139" to="11160" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">Q</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Alnuhait</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.13710</idno>
		<title level="m">Using textual interface to align external knowledge for end-to-end task-oriented dialogue systems</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Contrastive learning reduces hallucination in conversations</title>
		<author>
			<persName><forename type="first">W</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>De Rijke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ren</surname></persName>
		</author>
		<idno type="DOI">10.1609/aaai.v37i11.26596</idno>
		<ptr target="https://doi.org/10.1609/aaai.v37i11.26596" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, AAAI&apos;23/IAAI&apos;23/EAAI&apos;23</title>
				<meeting>the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, AAAI&apos;23/IAAI&apos;23/EAAI&apos;23</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Bang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cahyawijaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wilie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lovenia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Do</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.ijcnlp-main.45</idno>
		<ptr target="https://aclanthology.org/2023.ijcnlp-main.45" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Park</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Arase</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Hu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Lu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Wijaya</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Purwarianti</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Krisnadhi</surname></persName>
		</editor>
		<meeting>the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics<address><addrLine>Nusa Dua, Bali</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="675" to="718" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Survey of hallucination in natural language generation</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Frieske</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ishii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">J</forename><surname>Bang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Madotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<idno type="DOI">10.1145/3571730</idno>
		<ptr target="https://doi.org/10.1145/3571730" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Language models as few-shot learner for task-oriented dialogue systems</title>
		<author>
			<persName><forename type="first">A</forename><surname>Madotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fung</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2008.06239</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Are large language models all you need for task-oriented dialogue?</title>
		<author>
			<persName><forename type="first">V</forename><surname>Hudeček</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Dušek</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.sigdial-1.21</idno>
		<ptr target="https://aclanthology.org/2023.sigdial-1.21" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Stoyanchev</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Joty</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Schlangen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Dušek</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Kennington</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Alikhani</surname></persName>
		</editor>
		<meeting>the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Association for Computational Linguistics<address><addrLine>Prague, Czechia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="216" to="228" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Cao</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.08043</idno>
		<title level="m">Diaggpt: An llm-based and multi-agent dialogue system with automatic topic management for flexible task-oriented dialogue</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-K</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T</forename><surname>Luu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hooi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2309.08949</idno>
		<title level="m">Enhancing large language model induced task-oriented dialogue systems through look-forward motivated goals</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Leveraging large language models to power chatbots for collecting user self-reported data</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-H</forename><surname>Kim</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2301.05843</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Guiding large language models via directional stimulus prompting</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Galley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yan</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper_files/paper/2023/file/c5601d99ed028448f29d1dae2e4a926d-Paper-Conference.pdf" />
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Oh</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Neumann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Globerson</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Saenko</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Levine</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="62630" to="62656" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">MultiWOZ - a large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling</title>
		<author>
			<persName><forename type="first">P</forename><surname>Budzianowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T.-H</forename><surname>Wen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B.-H</forename><surname>Tseng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Casanueva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ultes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Ramadan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gašić</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D18-1547</idno>
		<ptr target="https://aclanthology.org/D18-1547" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Riloff</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Chiang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Hockenmaier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Tsujii</surname></persName>
		</editor>
		<meeting>the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics<address><addrLine>Brussels, Belgium</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="5016" to="5026" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rastogi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sunkara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.nlp4convai-1.13</idno>
		<ptr target="https://aclanthology.org/2020.nlp4convai-1.13" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI, Association for Computational Linguistics</title>
				<meeting>the 2nd Workshop on Natural Language Processing for Conversational AI, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="109" to="117" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">An iterative design methodology for user-friendly natural language office information applications</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Kelley</surname></persName>
		</author>
		<idno type="DOI">10.1145/357417.357420</idno>
		<ptr target="https://doi.org/10.1145/357417.357420" />
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="26" to="41" />
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Bunk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Varshneya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vlasov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nichol</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2004.09936</idno>
		<title level="m">DIET: Lightweight language understanding for dialogue systems</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A network-based end-to-end trainable task-oriented dialogue system</title>
		<author>
			<persName><forename type="first">T.-H</forename><surname>Wen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vandyke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Mrkšić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gašić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Rojas-Barahona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-H</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ultes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Young</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/E17-1042" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers</title>
		<title level="s">Association for Computational Linguistics</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Lapata</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Blunsom</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Koller</surname></persName>
		</editor>
		<meeting>the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers<address><addrLine>Valencia, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="438" to="449" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Shades of BLEU, flavours of success: The case of MultiWOZ</title>
		<author>
			<persName><forename type="first">T</forename><surname>Nekvinda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Dušek</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.gem-1.4</idno>
		<ptr target="https://aclanthology.org/2021.gem-1.4" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Bosselut</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Durmus</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><forename type="middle">P</forename><surname>Gangal</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Gehrmann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Jernite</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Perez-Beltrachini</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Shaikh</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Xu</surname></persName>
		</editor>
		<meeting>the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="34" to="46" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Adiwardana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Luong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">R</forename><surname>So</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Fiedel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Thoppilan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kulshreshtha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nemade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2001.09977</idno>
		<title level="m">Towards a human-like open-domain chatbot</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Thoppilan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>De Freitas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kulshreshtha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-T</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Baker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Du</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2201.08239</idno>
		<title level="m">LaMDA: Language Models for Dialog Applications</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">UBAR: Towards fully end-to-end task-oriented dialog system with GPT-2</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Quan</surname></persName>
		</author>
		<ptr target="https://ojs.aaai.org/index.php/AAAI/article/view/17674" />
	</analytic>
	<monogr>
		<title level="m">Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021</title>
				<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2021">February 2-9, 2021</date>
			<biblScope unit="page" from="14230" to="14238" />
		</imprint>
	</monogr>
	<note>The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Multi-task pre-training for plug-and-play task-oriented dialogue system</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Shu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Mansimov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-A</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.319</idno>
		<ptr target="https://aclanthology.org/2022.acl-long.319" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Muresan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Villavicencio</surname></persName>
		</editor>
		<meeting>the 60th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Dublin, Ireland</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4661" to="4676" />
		</imprint>
	</monogr>
	<note>Volume 1: Long Papers. Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment</title>
		<author>
			<persName><forename type="first">P</forename><surname>McCarthy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jarvis</surname></persName>
		</author>
		<idno type="DOI">10.3758/BRM.42.2.381</idno>
	</analytic>
	<monogr>
		<title level="j">Behavior Research Methods</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="page" from="381" to="392" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
