<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="it">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Evaluating the MuMe Dialogue System with the IDIAL Protocol</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Aureliano</forename><surname>Porporato</surname></persName>
							<email>aureliano.porporato@unito.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università degli Studi di Torino</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alessandro</forename><surname>Mazzei</surname></persName>
							<email>alessandro.mazzei@unito.it</email>
							<affiliation key="aff1">
								<orgName type="institution">Università degli Studi di Torino</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rosa</forename><surname>Meo</surname></persName>
							<email>rosa.meo@unito.it</email>
							<affiliation key="aff2">
								<orgName type="institution">Università degli Studi di Torino</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Daniele</forename><forename type="middle">P</forename><surname>Radicioni</surname></persName>
							<email>daniele.radicioni@unito.it</email>
							<affiliation key="aff3">
								<orgName type="institution">Università degli Studi di Torino</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Evaluating the MuMe Dialogue System with the IDIAL Protocol</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">6B716B1CB4BD548715DC338C95FB30F0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T22:35+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>English. In this paper we describe the implementation of the MuMe dialogue system, a task-based dialogue system for a car sharing service, and its evaluation through the IDIAL protocol. Finally, we report some comments on this novel dialogue system evaluation method.</p><p>Italiano. In questo lavoro descriviamo l'implementazione del sistema di dialogo MuMe, realizzato per un sistema di car sharing, e la sua valutazione attraverso il protocollo IDIAL. Infine, offriamo alcuni commenti su questo nuovo metodo per la valutazione di sistemi di dialogo.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="it">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Interest in dialogue systems is on the rise in the NLP community <ref type="bibr">(McTear et al., 2016)</ref>, driven by the strong demand for natural and effective user interaction in applications such as customer care <ref type="bibr" target="#b5">(Hu et al., 2018)</ref>. A related and central issue is the evaluation of such systems. In this setting, it is widely known that most evaluation metrics that come from machine translation, which compare a model-generated response to a single target response, correlate poorly with human judgement <ref type="bibr" target="#b7">(Liu et al., 2016)</ref>.</p><p>In this paper we briefly illustrate a task-oriented dialogue system called MuMe (from "MUoversi MEglio", "travelling better" in English), and examine how far the IDIAL evaluation protocol <ref type="bibr">(Cutugno et al., 2018)</ref> helps in its assessment. IDIAL consists of a usability evaluation, carried out by a group of users, and an evaluation of the robustness of the dialogue model, based on linguistic variations of the successful interactions with the users. The application being tested is a prototype dialogue system that we developed for the reservation of electric vehicles in the context of a car sharing service. A user must be able to interact with the system to specify when and where s/he wants to leave and which sort of vehicle is needed. While there are some services and frameworks dedicated to the development of machine-learning-based dialogue systems, like Google Dialogflow<ref type="foot" target="#foot_0">2</ref> or the open source Rasa<ref type="foot" target="#foot_1">3</ref> framework, the lack of Italian dialogue corpora in the specific domain of car sharing reservations (see, e.g., <ref type="bibr" target="#b11">Serban et al. (2018)</ref>) and our inability to recruit enough people to create such a corpus forced us to choose a different solution: we developed a simpler and less data-reliant rule-based system, based on slot-filling semantics. Moreover, the decisions made by this kind of system can be tracked throughout the computation, which makes them quite explainable. This is a desirable feature, since it simplifies the debugging and the maintenance of the routines, and allows an easier extension of the system to meet additional requirements. This paper is mostly concerned with the evaluation of the MuMe system. The structure of the paper is as follows. After surveying related work (Section 2), we briefly introduce the overall architecture and the main components of the MuMe dialogue system (Section 3); we then evaluate MuMe using the IDIAL protocol, and employ the MuMe experimentation as a case study for giving feedback on the IDIAL protocol itself (Section 4); finally, in Section 5 we briefly recap the main contributions of the paper, and point to ongoing and future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>The pioneering work of <ref type="bibr" target="#b2">Bobrow et al. (1977)</ref> proposed the frame-based architecture that most task-based dialogue systems implement. The basic idea is to abandon the demanding goal of a genuine logical representation of the dialogue meaning and to adopt a simpler slot-filling semantics. In some sense, the event-entities representation of the modern neural-based dialogue system frameworks can be seen as an ultimate evolution of that simplification idea. <ref type="bibr" target="#b1">Aust et al. (1995)</ref> presented a rule-based system to some extent similar to ours in its purpose and structure, created for a train-seat reservation project. This system has to grasp the names of cities, train stations, dates and times, and it is able to perform quite sophisticated temporal information processing. Further rule-based systems are reviewed in the survey by <ref type="bibr" target="#b0">Abdul-Kader and Woods (2015)</ref>.</p><p>A different class of dialogue systems is based on neural networks. A survey on this class of systems can be found in <ref type="bibr" target="#b8">(Mathur and Singh, 2018)</ref>.</p><p>Regarding the evaluation of dialogue systems, the work by <ref type="bibr" target="#b3">Bohlin et al. (1999)</ref> proposes the Trindi Tick-list, a wish list of the desired dialogue behaviour and features specified as a checklist of "yes-no" questions. As regards this approach, Braunger and Maier (2017) argue that standardised evaluation models do not enable a complete evaluation of a dialogue system. Rather, they suggest that such an evaluation must take into account the natural flow of the interaction between the user and the system itself; such a measure involves many language- and user-dependent factors, such as the length of the user utterances. These principles were tested in human-computer vocal interactions occurring on board vehicles.
Further information on dialogue system evaluation methods can be found in the survey by <ref type="bibr" target="#b5">Deriu et al. (2019)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">The MuMe system architecture</head><p>Figure <ref type="figure" target="#fig_1">1</ref> depicts the basic architecture of the MuMe dialogue system. The information flow starts from a sentence typed by the user: this sentence is handled by the OpenDial system (see Section 3.1), which plays both the role of the dialogue manager and of the system orchestrator. The sentence is then syntactically parsed and semantically analyzed by an IE module (see Section 3.2), and the result of the processing is converted into a slot-filling form. When control returns to OpenDial, it generates an answer on the basis of a dialogue control strategy (see Section 3.3) and returns it to the user.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">The OpenDial Dialogue Manager</head><p>The main component of our software architecture is the OpenDial open source framework for dialogue management <ref type="bibr" target="#b6">(Lison, 2015)</ref>. The system, which was designed for speech interaction, adopts the information state approach for modelling the state of the dialogue <ref type="bibr" target="#b13">(Traum and Larsson, 2003)</ref>, that is, a collection of variables representing the current state of the system. The transition between states, i.e. the change of the variable values, is governed by the activation of a set of "if-then-else" rules on input values as well as on the variation of some variables. Indeed, OpenDial uses these rules to model the sub-tasks of user utterance understanding, dialogue management and response generation. Moreover, the integration of the system with external tools is simple. We exploited this capability in MuMe, since for language understanding we used a module based on an external parser (see below). Additionally, the OpenDial framework implements some statistical techniques to deal with uncertainty, which make it possible to learn interaction models from existing dialogues. This feature is particularly important for speech-based dialogue systems, where uncertain information arises from automatic speech recognition. However, at this stage of the MuMe project, we did not use this feature since we were working on written texts only.</p></div>
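<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the information state approach described above, the following Python sketch models the dialogue state as a dictionary of variables and fires simple if-then rules until no variable changes. It is not OpenDial code and does not use OpenDial's rule format: the names Rule, apply_rules, u_u, a_u and a_m, as well as the two rules themselves, are hypothetical and chosen only for this example.</p>
<p>
```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]


@dataclass
class Rule:
    """An if-then rule: when `condition` holds, `effect` returns variable updates."""
    condition: Callable[[State], bool]
    effect: Callable[[State], State]


def apply_rules(state: State, rules: List[Rule]) -> State:
    """Fire the rules repeatedly until the state stops changing."""
    state = dict(state)  # work on a copy, leave the caller's state untouched
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.condition(state):
                updates = rule.effect(state)
                # Only real changes count, so the loop is guaranteed to stop
                # once every applicable rule has already had its effect.
                if any(state.get(key) != value for key, value in updates.items()):
                    state.update(updates)
                    changed = True
    return state


# Hypothetical variables: u_u = user utterance, a_u = user act, a_m = system act.
RULES = [
    # Understanding: a mention of "auto" is read as a request for a car.
    Rule(lambda s: "auto" in s.get("u_u", "") and "a_u" not in s,
         lambda s: {"a_u": "RequestCar"}),
    # Management: a car request without a start time triggers a question.
    Rule(lambda s: s.get("a_u") == "RequestCar" and "start_time" not in s,
         lambda s: {"a_m": "AskStartTime"}),
]

final_state = apply_rules({"u_u": "ho bisogno di un'auto domani"}, RULES)
print(final_state["a_m"])  # AskStartTime
```
</p>
<p>Because each rule reads and writes the same shared set of variables, the understanding and management steps can be chained without any explicit control flow between them, which is the property the information state approach relies on.</p></div>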
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Parsing and Information Extraction</head><p>In order to assign semantic roles to the entities in the dialogues, we decided to run a syntactic parser on the text entered by the user.</p><p>As our main parsing module we used Tint (The Italian NLP Tool) (Palmero Aprosio and Moretti, 2016), a framework modeled on Stanford CoreNLP <ref type="bibr" target="#b7">(Manning et al., 2014)</ref>. Tint performs some fundamental processing of user utterances, such as dependency parsing, Named Entity Recognition and the extraction of temporal expressions. In particular, these tasks are executed by interfacing with external tools.</p><p>For the recognition of temporal expressions (such as dates and times), Tint integrates the services provided by HeidelTime <ref type="bibr" target="#b12">(Strötgen and Gertz, 2013)</ref>. HeidelTime extracts various sorts of temporal expressions in several languages, including Italian, and represents them in the standard TIMEX3 format.</p><p>For the treatment of geographic expressions, Tint interfaces with the Nominatim wrapper.<ref type="foot" target="#foot_2">4</ref> However, this (free and open source) service performs poorly in geocoding (i.e., in searching the GPS coordinates of a given address). As a consequence, we decided to use the Google Maps API<ref type="foot" target="#foot_3">5</ref>, which provides better performance. Indeed, Maps offers an API for address autocomplete, once the address has been isolated from the rest of the sentence, and for geolocation (i.e., searching the coordinates of the user), too.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Dialogue Control Strategy</head><p>The simple control strategy that governs the moves of the dialogue is based on the filling of a number of mandatory slots in the domain-specific slot-filling semantics adopted for the car reservation domain.</p><p>In particular, the mandatory slots are the start date, the start time and the start stall (which encodes the start position). Indeed, the simplest reservation in MuMe needs only these pieces of information: a person reserves a standard car, starting at a specific time of a specific day from a specific stall, and will return the car to the same stall without the need to specify the return date and time.</p><p>However, more complex reservations need more information, which is encoded in the non-mandatory slots of end date, end time, end stall and vehicle type. For example, the user can choose between three types of vehicles, but if the kind of vehicle is not specified, the system assigns a default 'economy car' to the vehicle type slot.</p><p>The MuMe system adopts a mixed initiative for dialogue handling. Although the dialogue is overall system-driven, the user starts the conversation by possibly providing some initial information. Richer initial information is expected to result in a shorter dialogue interaction. Indeed, a design goal of the MuMe system is to produce a dialogue as short as possible.
For this reason, also in the subsequent interactions, if the user gives various pieces of information in a single utterance, the system can extract all such information and is able to assign each filler to the corresponding slot, thus avoiding further unnecessary questions.</p><p>When the user begins the interaction with the MuMe system, the system replies with a welcome message and with a general question aimed at encouraging the user to start the interaction in the most natural way.</p><p>In order to give more details on the control strategy, we now consider the following running example and its processing in MuMe (see Figure <ref type="figure" target="#fig_1">1</ref>): (it) "User: Ho bisogno di un'auto domani per andare in via Pessinetto" (en) "User: I need a car tomorrow to go to Pessinetto street".<ref type="foot" target="#foot_4">6</ref> The Information Extraction phase detects a date (through HeidelTime) and an address (extracted through a basic set of custom rules) in the user sentence. By means of other rules that check the shape of the dependency tree (obtained through Tint), date and address are labelled as start date and end address. Particularly relevant in this case is the verb "andare" ("to go"), which signals that the following address is where the user wants to arrive and not a starting point. In the post-processing phase some additional information can be inferred, like the value of the start address, left unspecified by the user: it can be inferred by retrieving the GPS coordinates of the user's current position by means of the Google Maps API. Once the user's current location has been identified, the nearest stall is selected as the start stall.</p><p>At the end of this processing, the system has successfully filled the start address, start stall, end address, end stall and start date slots. Some mandatory slots are still left unfilled, such as the start time, so the system will ask the user to provide the missing information.
As a consequence, the response of the system will be a question selected from a fixed list based on the unfilled slots: in this specific example, the system will continue by asking for the departure time.</p><p>At the end of the filling phase of the mandatory slots, the system gives the user the possibility to modify the request and to correct possible errors and misunderstandings. The slot values are then sent to a dedicated server for the finalization of the reservation.</p></div>
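<div xmlns="http://www.tei-c.org/ns/1.0"><p>The control strategy described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MuMe implementation: the slot names follow the text, but the question wording, the stall names, the coordinates, the date value and the use of the haversine formula to pick the nearest stall are all hypothetical choices made for this example.</p>
<p>
```python
import math
from typing import Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude) in degrees

# Mandatory slots and the default for the optional vehicle type, as in the text.
MANDATORY_SLOTS = ("start_date", "start_time", "start_stall")
OPTIONAL_DEFAULTS = {"vehicle_type": "economy car"}

# Hypothetical wording: the text only says questions come from a fixed list.
QUESTIONS = {
    "start_date": "On which day do you want to leave?",
    "start_time": "At what time do you want to leave?",
    "start_stall": "From which stall do you want to leave?",
}


def haversine_km(a: Coordinates, b: Coordinates) -> float:
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_stall(position: Coordinates, stalls: Dict[str, Coordinates]) -> str:
    """Name of the stall closest to the given position."""
    return min(stalls, key=lambda name: haversine_km(position, stalls[name]))


def next_question(slots: Dict[str, str]) -> Optional[str]:
    """Question for the first unfilled mandatory slot, or None if all are filled."""
    for slot in MANDATORY_SLOTS:
        if slots.get(slot) is None:
            return QUESTIONS[slot]
    return None


def finalize(slots: Dict[str, str]) -> Dict[str, str]:
    """Apply defaults to optional slots that the user left unspecified."""
    return {**OPTIONAL_DEFAULTS, **slots}


# Made-up stalls and user position, roughly in the Turin area.
STALLS = {"Stall A": (45.071, 7.686), "Stall B": (45.100, 7.650)}
user_position = (45.070, 7.680)

# State after the running example: a date and a destination were extracted,
# and the start stall is inferred from the user's position.
slots = {
    "start_date": "2019-06-12",  # hypothetical resolution of "domani"
    "end_address": "via Pessinetto",
    "start_stall": nearest_stall(user_position, STALLS),
}
print(next_question(slots))  # At what time do you want to leave?
```
</p>
<p>Checking the mandatory slots in a fixed order keeps the dialogue short: any slot already filled by a rich first utterance is skipped, and the loop ends as soon as next_question returns None.</p></div>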
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Evaluation</head><p>In order to have a first preliminary evaluation of the MuMe system, we applied the Trindi Tick-list protocol, that is, a set of "yes-no" questions concerning specific capabilities of the developed system <ref type="bibr" target="#b3">(Bohlin et al., 1999)</ref>. While this simple questionnaire is helpful in the development phase, since it gives a measure of the system limits, it is not suitable for a complete evaluation of the actual experience of the user. At this stage of development, the MuMe system has a Trindi score of six out of twelve with respect to the (original) list. Among the six features not yet implemented, there are complex tasks, such as the management of the help and non-help sub-dialogues, dealing with negative information, and dealing with noisy input.</p><p>In the rest of the Section, we report the results obtained by applying the IDIAL evaluation protocol to the current version of the MuMe system. The protocol is split into a questionnaire concerning the user experience (Section 4.1), and a number of stress tests concerning the linguistic robustness of the system (Section 4.2).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">IDIAL User Evaluation</head><p>A group of 5 subjects (3 males, 2 females, 19, 22, 25, 26 and 61 years old) was recruited for the evaluation task by personal invitation and without rewards. After a brief oral description of the domain and of the basic mechanisms of interaction with the system, each user was asked to generate 7 complete dialogues with the system in a controlled environment. We asked the users to simulate the process of reserving a car without other specific constraints.</p><p>In Table <ref type="table">1</ref> we report the ten questions of the IDIAL user test with the average score, obtained on a five-point Likert scale.<ref type="foot" target="#foot_5">7</ref> Note that questions 3, 4, 7 and 10 have been designed to evaluate the effectiveness of the dialogue system, while questions 1 and 2 regard the system efficiency.<ref type="foot" target="#foot_6">8</ref> </p></div>
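<div xmlns="http://www.tei-c.org/ns/1.0"><p>For readers who want to reproduce the kind of summary reported for each question, the following Python sketch computes an average score and a standard deviation from five Likert ratings. The ratings below are hypothetical and are not the answers collected in the study, and the sketch assumes the sample standard deviation (n - 1 denominator), since the text does not state which estimator was used.</p>
<p>
```python
from statistics import mean, stdev


def summarize(ratings):
    """Return (mean, sample standard deviation), rounded to 1 and 2 decimals."""
    return round(mean(ratings), 1), round(stdev(ratings), 2)


# Hypothetical ratings given by 5 users to one question (not the study's data).
ratings = [3, 3, 3, 3, 4]
average, deviation = summarize(ratings)
print(f"{average:.1f} ({deviation:.2f})")  # 3.2 (0.45)
```
</p>
<p>With only five raters, a single divergent answer moves the standard deviation noticeably, which is worth keeping in mind when comparing questions.</p></div>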
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">IDIAL Stress Tests</head><p>The second evaluation stage in the IDIAL protocol consists of a set of linguistic stress tests. We selected 5 dialogues (one for each user) among those successfully completed<ref type="foot" target="#foot_7">9</ref> during the user evaluation stage. Following the IDIAL protocol, we modified one sentence in each dialogue, once for each test, as illustrated in <ref type="bibr">(Cutugno et al., 2018)</ref>, and repeated the dialogue with the modified sentence. The results are reported in Table <ref type="table" target="#tab_0">2</ref>.</p><p>Note that we could not perform three stress tests for distinct reasons. We could not perform the ST-8 test, regarding active-passive alternation, because the users almost always used intransitive verbs (like "andare" ["to go"] and "partire" ["to depart"]). We could not perform the ST-9 test, concerning adjective-noun alternation, since the users used very few adjectives (like the vehicle type modifier "lussuosa" ["luxurious"]), and no adjectives were used in a successful dialogue. Finally, we could not perform the ST-10 test, concerning anaphora resolution, since at the current stage of development the system never asks the user to pick an answer from a set of options.</p><figure type="table"><head>Table 1 :</head><label>1</label><figDesc>IDIAL user ratings of their experience: the average scores are provided on a 1-5 Likert scale, with standard deviation in parentheses.</figDesc><table><row><cell>N</cell><cell>Sentence</cell><cell>Evaluation</cell></row><row><cell>1</cell><cell>The system was efficient in accomplishing the task.</cell><cell>3.2 (0.45)</cell></row><row><cell>2</cell><cell>The system quickly provided all the information that the user needed.</cell><cell>3.6 (0.55)</cell></row><row><cell>3</cell><cell>The system is easy to use.</cell><cell>3.6 (1.52)</cell></row><row><cell>4</cell><cell>The system is awkward when the user interacts with a non-standard or unexpected input.</cell><cell>2.8 (0.84)</cell></row><row><cell>5</cell><cell>The user is satisfied by his/her experience.</cell><cell>3.0 (0.00)</cell></row><row><cell>6</cell><cell>The user would recommend the system.</cell><cell>3.2 (0.84)</cell></row><row><cell>7</cell><cell>The system has a fluent dialogue.</cell><cell>2.8 (0.84)</cell></row><row><cell>8</cell><cell>The system is charming.</cell><cell>3.4 (0.90)</cell></row><row><cell>9</cell><cell>The user enjoyed the time s/he spent using the software.</cell><cell>3.8 (0.84)</cell></row><row><cell>10</cell><cell>The system is flexible to the user's needs.</cell><cell>3.6 (0.55)</cell></row></table></figure></div>
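<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the spelling-level stress tests more concrete, the following Python sketch shows how a character swap (in the spirit of ST-4) and a character replacement (in the spirit of ST-3) could be applied to a user sentence, and how a pass percentage over five dialogues could be computed. The function names, the chosen character positions and the list of outcomes are hypothetical: the IDIAL protocol defines the tests, but this is not its reference implementation and the outcomes are not the study's data.</p>
<p>
```python
from typing import List


def swap_characters(sentence: str, index: int) -> str:
    """Swap the characters at positions index and index + 1."""
    chars = list(sentence)
    chars[index], chars[index + 1] = chars[index + 1], chars[index]
    return "".join(chars)


def replace_character(sentence: str, index: int, new_char: str) -> str:
    """Replace the character at position index with new_char."""
    return sentence[:index] + new_char + sentence[index + 1:]


def pass_rate(outcomes: List[bool]) -> str:
    """Percentage of dialogues that were still completed after the change."""
    return f"{100 * sum(outcomes) // len(outcomes)}%"


original = "Ho bisogno di un'auto domani"

# Position 22 is the "d" of "domani"; position 24 is its "m".
print(swap_characters(original, 22))         # Ho bisogno di un'auto odmani
print(replace_character(original, 24, "n"))  # Ho bisogno di un'auto donani

# Hypothetical outcomes for the 5 selected dialogues under one stress test.
print(pass_rate([True, True, True, False, False]))  # 60%
```
</p>
<p>Each modified sentence is then fed back into the same dialogue in place of the original one, and the test counts as passed only if the dialogue still completes successfully.</p></div>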
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Discussion</head><p>With respect to the user evaluation test, a number of considerations arise from the scores. The main issue pointed out by the users during the evaluation phase is the difficulty in grasping when and why the system misunderstood (or lost) some pieces of information, which resulted in a relatively poor evaluation score for the fluency of the system (average score of 2.8). The lack of feedback, due to the overly simple way in which we generate system responses, made this problem even worse, leading the user to repeat the same mistake more than once. The standard deviation of the evaluations given to question 3 shows the high subjectivity of the user experiences with the system, and points out the need to equip the system with some form of user model to account for the expectations of different kinds of users. Four out of five users explicitly stated (in private conversations after the evaluation phase) that they expected longer interactions. Also, they expected to receive more questions from the system, challenging our assumption on the length of dialogues. However, two of the same users added that 7 interactions are enough to evaluate the system.</p><p>With respect to the evaluation of the stress tests, the sentences provided by the users during the interaction with the system were often very short and scarcely usable from the viewpoint of the IDIAL stress tests (especially those concerned with lexical and syntactic aspects). Another source of problems is typos, in particular in expressions regarding time and addresses. While our system seems quite robust to this kind of error (see the first 4 rows of Table <ref type="table" target="#tab_0">2</ref>), it is difficult to deal with them automatically without some domain-specific knowledge on their occurrence and some correction strategies.</p><p>As a final note, we want to report some comments given by the users about the questionnaire.
Two users expressed some doubts on the interpretation of question 8, and in general all of them found it difficult to assign a meaningful evaluation to it. For example, some of the users interpreted the question as regarding the lack of a GUI, absent in our prototype. We think that the ambiguity of the sentence explains the slightly higher standard deviation for that question with respect to others. Other comments include the lack of diversity between some sentences (like questions 1 and 5, often judged as redundant), and the inadequacy of this Likert scale to evaluate some questions, like 5 and 9: the users consider a more subjective scale ("poco" ["a little"] - "molto" ["a lot"]) more appropriate, perceiving the whole process as a single experience.</p><p>While the linguistic stress test can be a valuable tool for the improvement of the system, the questionnaire concerning the user experience should be revised to address some of the criticisms that we collected. In particular, the questionnaire should be augmented with more specific questions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion and Future Work</head><p>We presented the MuMe system, a prototype of a rule-based dialogue system, and its evaluation through the IDIAL method.</p><p>Since the MuMe project is still in development, there is much room for improvement. The most pressing problem to be addressed in future development is the generation of responses that are more meaningful to the user. The application of a natural language generation pipeline for Italian (e.g. <ref type="bibr" target="#b10">Mazzei et al., 2016;</ref><ref type="bibr" target="#b10">Mazzei, 2016;</ref><ref type="bibr" target="#b4">Conte et al., 2017;</ref><ref type="bibr" target="#b5">Ghezzi et al., 2018</ref>) could help to this end.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>1</head><label></label><figDesc>Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The schematic architecture of the MuMe dialogue system.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2 :</head><label>2</label><figDesc>IDIAL stress test results.</figDesc><table><row><cell>Stress Test</cell><cell>Passed</cell></row><row><cell>Spelling Substitutions</cell><cell></cell></row><row><cell>ST-1 Confused words</cell><cell>60%</cell></row><row><cell>ST-2 Misspelled words</cell><cell>40%</cell></row><row><cell>ST-3 Character replacement</cell><cell>80%</cell></row><row><cell>ST-4 Character swapping</cell><cell>60%</cell></row><row><cell>Lexical Substitutions</cell><cell></cell></row><row><cell>ST-5 Less frequent synonyms</cell><cell>60%</cell></row><row><cell>ST-6 Change of register</cell><cell>40%</cell></row><row><cell>ST-7 Coreference</cell><cell>100%</cell></row><row><cell>Syntactic Substitutions</cell><cell></cell></row><row><cell>ST-8 Active-Passive alternation</cell><cell>−</cell></row><row><cell>ST-9 Nouns-adjectives inversion</cell><cell>−</cell></row><row><cell>ST-10 Anaphora resolution</cell><cell>0%</cell></row><row><cell>ST-11 Verbal-modifier inversion</cell><cell>80%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://dialogflow.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">https://rasa.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">http://nominatim.org/.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">https://cloud.google.com/maps-platform/.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">The English versions of the user and system sentences are given for clarity. The system is available in Italian only.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_5">We used the Italian version of the questionnaire, found in Appendix A of https://tinyurl.com/yxngqkx4, but for the sake of readability we report the English version in Table 1.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_6">The answers of each subject are available at https://tinyurl.com/y6nruwon</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_7">We considered an interaction as 'successfully completed' if the system recognized and processed correctly all the data given by the user.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This project has been partially supported by the MuMe Project (Muoversi Meglio), funded by the Piedmont Region and EU in the frame of the F.E.S.R. 2014/2020.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Survey on chatbot design techniques in speech conversation systems</title>
		<author>
			<persName><forename type="first">Sameera</forename><forename type="middle">A</forename><surname>Abdul-Kader</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Woods</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Advanced Computer Science and Applications</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">7</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The Philips automatic train timetable information system</title>
		<author>
			<persName><surname>Aust</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Speech Communication</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">3-4</biblScope>
			<biblScope unit="page" from="249" to="262" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">GUS, a frame-driven dialog system</title>
		<author>
			<persName><surname>Bobrow</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artif. Intell</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="155" to="173" />
			<date type="published" when="1977-04">April 1977</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Natural language input for in-car spoken dialog systems: How natural is natural</title>
		<author>
			<persName><surname>Bohlin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue</title>
				<meeting>the 18th Annual SIGdial Meeting on Discourse and Dialogue</meeting>
		<imprint>
			<date type="published" when="1999">1999. 1999. 2017</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="137" to="146" />
		</imprint>
	</monogr>
	<note>Survey of existing interactive systems</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Dealing with Italian adjectives in noun phrase: a study oriented to natural language generation</title>
		<author>
			<persName><surname>Conte</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)</title>
				<editor>
			<persName><forename type="first">Francesco</forename><surname>Cutugno</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename></persName>
		</editor>
		<meeting>the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)<address><addrLine>Rome, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017-12-11">2017. 2017. December 11-13, 2017. December. 2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Auxiliary selection in Italian intransitive verbs: A computational investigation based on annotated corpora</title>
		<author>
			<persName><forename type="first">Di</forename><surname>Maro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sara</forename><surname>Falcone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Guerini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bernardo</forename><surname>Magnini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Antonio</forename><surname>Origlia</surname></persName>
		</author>
		<author>
			<persName><surname>Deriu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1905.04071</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)</title>
				<meeting>the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)<address><addrLine>Berlin</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page">415</biblScope>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
	<note>Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A hybrid approach to dialogue management based on probabilistic rules</title>
		<author>
			<persName><forename type="first">Pierre</forename><surname>Lison</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Speech &amp; Language</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="232" to="255" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation</title>
		<author>
			<persName><surname>Liu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1603.08023</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations</title>
				<meeting>52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations</meeting>
		<imprint>
			<date type="published" when="2014">2016. 2016. 2014. 2014</date>
			<biblScope unit="page" from="55" to="60" />
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
	<note>The Stanford CoreNLP natural language processing toolkit</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">Vinayak</forename><surname>Mathur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Arpit</forename><surname>Singh</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1803.08419</idno>
		<title level="m">The rapidly changing landscape of conversational agents</title>
				<editor>
			<persName><forename type="first">Alessandro</forename><surname>Mazzei</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">SimpleNLG-IT: adapting SimpleNLG to Italian</title>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Battaglino</surname></persName>
		</author>
		<author>
			<persName><surname>Bosco</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 9th International Natural Language Generation conference</title>
				<meeting>the 9th International Natural Language Generation conference<address><addrLine>Edinburgh, UK</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2016-09-05">September 5-8, 2016</date>
			<biblScope unit="page" from="184" to="192" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Building a computational lexicon by using SQL</title>
		<author>
			<persName><forename type="first">Alessandro</forename><surname>Mazzei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zoraida</forename><surname>Callejas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Griol</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) &amp; Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016)</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Palmero Aprosio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Moretti</surname></persName>
		</editor>
		<meeting>Third Italian Conference on Computational Linguistics (CLiC-it 2016) &amp; Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016)<address><addrLine>Napoli, Italy</addrLine></address></meeting>
		<imprint>
			<publisher>Springer Publishing Company, Incorporated</publisher>
			<date type="published" when="2016-12-05">December 5-7, 2016</date>
			<biblScope unit="volume">1749</biblScope>
			<biblScope unit="page" from="1" to="5" />
		</imprint>
	</monogr>
	<note>Italy goes to Stanford: a collection of CoreNLP modules for Italian. ArXiv e-prints</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A survey of available corpora for building data-driven dialogue systems: The journal version</title>
		<author>
			<persName><surname>Serban</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Dialogue &amp; Discourse</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="49" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Multilingual and cross-domain temporal tagging</title>
		<author>
			<persName><forename type="first">Jannik</forename><surname>Strötgen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Gertz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Language Resources and Evaluation</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="269" to="298" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">The Information State Approach to Dialogue Management</title>
		<author>
			<persName><forename type="first">David</forename><surname>Traum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Staffan</forename><surname>Larsson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Current and New Directions in Discourse and Dialogue</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="325" to="353" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
