<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Irene Sucameli</string-name>
          <email>irene.sucameli@phd.unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Lenci</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernardo Magnini</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manuela Speranza</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Simi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Informatica, Università di Pisa</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Filologia, Letteratura, Linguistica, Università di Pisa</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fondazione Bruno Kessler, Trento</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>English. The difficulty in finding useful dialogic data to train a conversational agent is an open issue even nowadays, when chatbots and spoken dialogue systems are widely used. For this reason we decided to build JILDA, a novel data collection of chat-based dialogues, produced by Italian native speakers and related to the job-offer domain. JILDA is the first dialogue collection related to this domain for the Italian language. Because of its collection modalities, we believe that JILDA can be a useful resource not only for the Italian research community, but also for the international one.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Italian abstract (English translation)</title>
      <p>In recent years the use of chatbots and dialogue systems has
become increasingly common; however, finding adequate training data
for conversational agents is still an unresolved issue.
For this reason we decided to produce JILDA, a new dataset of dialogues
in the job-search domain, produced via chat by Italian native
speakers. JILDA is the first collection of dialogues for this
domain in Italian. Because of its methodology and the way
the data were collected, we believe that this
resource can be useful and interesting
not only for the Italian research community
but also for the international one.</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>Chatbots and spoken dialogue systems are now
widespread; however, their development still faces one main
issue: the availability of training data. Finding useful data to train a
system to interact in as human-like a way as possible is not
a trivial task. This problem is even more critical
for Italian, for which only a few datasets
are available. To address this lack of
data, we decided to develop JILDA (Job Interview
Labelled Dialogues Assembly), a new collection
of chat-based, mixed-initiative, human-human
dialogues related to the job-offer domain. Our work
offers several elements of novelty. First of all, it
constitutes, to the best of our knowledge, the first
dialogue collection for this domain in Italian.
Moreover, our dataset was not built
using the Wizard of Oz approach usually adopted for
collecting dialogues. Instead, we used an
approach similar to the Map Task one, as
described in the next section. This allowed us to
obtain more complex, mixed-initiative dialogues.</p>
    </sec>
    <sec id="sec-3">
      <title>Background</title>
      <p>
        Few dialogic datasets are available for Italian,
including the NESPOLE dialogues related to the
tourism domain
        <xref ref-type="bibr" rid="ref13">(Mana, 2004)</xref>
        , QA datasets
related to the movie or the customer care domains
        <xref ref-type="bibr" rid="ref2">(Bentivogli, 2014)</xref>
        , and a recent dataset derived
from the translation of the English SNIPS
        <xref ref-type="bibr" rid="ref6">(Castellucci, 2019)</xref>
        . However, the resources currently
available are still limited and, to the best of our
knowledge, none of them covers the job-offer
domain. As for English, although more dialogic
resources are available to train conversational
agents
        <xref ref-type="bibr" rid="ref10 ref11 ref20 ref9">(Lowe, 2015; Yu, 2015; El Asri, 2017;
Budzianowski, 2018; Li, 2018)</xref>
        , as far as we
know there are no relevant and freely accessible
datasets related to job-matching. Moreover, these
datasets usually record simplified conversations,
which do not reflect the actual complexity
of human-human interactions.
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use
permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).</p>
      <p>
        To fill this gap, we decided to produce a new
dialogic dataset for the job domain in Italian.
To collect data representative of the
linguistic naturalness of native speakers, we first had to
identify the most suitable collection approach.
      </p>
      <p>
        The WoZ approach. One of the most common
approaches used to build full-scale datasets is
Wizard of Oz (WoZ) (Kelley, 1984), in which a
human (the wizard) plays the role of the computer
within a simulated human-computer conversation.
The other participants in the conversation,
however, are not aware that they are talking to a
human rather than to a conversational system
        <xref ref-type="bibr" rid="ref14">(Rieser,
2008)</xref>
        . This method has pros and cons: it makes it possible
to collect conversations written in natural
language in a short time
        <xref ref-type="bibr" rid="ref18">(Wen, 2017)</xref>
        ; however,
the dialogues built in this way may not capture the
noisy conditions of real conversations
(e.g. repetitions, errors) and show little
syntactic and semantic variation
(Budzianowski, 2018). Because of these
limitations, we decided to adopt other methods
to build our dataset. The first method, used in an
initial experimental phase, was the
template-based approach.
      </p>
      <p>
        The template-based approach. In this approach,
a volunteer is asked to paraphrase template
dialogues in natural language in order to
create a simulated dialogue
        <xref ref-type="bibr" rid="ref15">(Shah, 2018)</xref>
        . We
tried this approach in an initial experimental
phase, in which we used templates to create
task-oriented dialogues. In this first experiment,
as previously done by Shah et al.
        <xref ref-type="bibr" rid="ref15">(Shah, 2018)</xref>
        ,
we used Amazon Mechanical Turk<sup>1</sup> and asked
Italian native speakers to play the role of both
the computer and the user, paraphrasing templates
of dialogues between a recruiter and a job seeker.
We proposed three different templates, with
15-20 recruiter-user interactions each and, to ensure
greater lexical variety, we inserted some random
variables into the templates (for example, the user's
skills and the type of job requested). With this
experimental setup, we built a first dataset of 220
dialogues. However, despite the attempts to ensure
linguistic variety, we noticed that in the MTurk
dataset the conversation was strongly guided by
the templates provided and that the dialogues showed
little lexical diversity.
      </p>
      <p><sup>1</sup> Available here: https://www.mturk.com/</p>
      <p>
        The Map Task approach. To overcome the
limits of the WoZ and template-based
approaches, and to produce a set of mixed-initiative
dialogues that reflect the naturalness typical of
human-human interaction, we decided to
organise a new experiment. In this second
experimental phase, we took as a guideline the
methodology adopted for the Map Task experiment
        <xref ref-type="bibr" rid="ref3">(Brown,
1984)</xref>
        , in which two participants collaborate to
achieve a common goal. For example,
Anderson et al. adopted the Map Task to build the
HCRC Corpus
        <xref ref-type="bibr" rid="ref1">(Anderson, 1991)</xref>
        , a corpus of
dialogue recordings and transcriptions. A corpus built in
a similar way, but for Italian,
is CLIPS<sup>2</sup>, a dataset of speech
recordings.
      </p>
      <p>In Anderson’s Map Task, one speaker (the
Instruction Giver) has a route marked on the map
while the other speaker (the Instruction Follower)
has the map without the route and, talking with
the Instruction Giver, has to reproduce the route.
However, the two maps are not identical and the
participants have to discover how they differ.</p>
      <p>In our experiment, the two parties had to
collaborate in a conversation to find the best match
between job offer and candidate profile. The
participants played the role of either the navigator<sup>3</sup>, who
had a set of possible job offers, or the
applicant, who was provided with a job profile to
impersonate (a short CV). While in the HCRC Map
Task the two parties had to interact in order to figure
out the route on the blind map, in our case the two
participants had to chat to find the best possible job-offer
match for both parties. The next section
describes the framework and the setup of our
experiment in detail.</p>
    </sec>
    <sec id="sec-4">
      <title>Experimental setup</title>
      <p>To create the JILDA dialogue collection for the
job-offer domain, we asked 50 Italian native speakers to
simulate a conversation between a "navigator" and an
applicant. At the end of the experiment, all the
volunteers received a financial reward for their
participation. We randomly assigned the role of navigator to 25
volunteers and provided each of them with 5 job offers.
The other 25 volunteers had to pretend to be
applicants and describe themselves on the basis of
the information contained in a curriculum we
provided. The navigators' goal was to help applicants,
by asking questions, to find the job offer (among those available) best
suited to their curriculum and interests.
Applicants, in turn, had to
interact with the navigator, describing the skills and
competencies included in their curricula.</p>
      <p><sup>2</sup> Available here: http://www.clips.unina.it/it/corpus.jsp</p>
      <p><sup>3</sup> The navigator plays a role similar to that of a recruiter,
who is in charge of reviewing a candidate's skills and past
experiences in order to find a suitable job.</p>
      <p>As in the Map Task framework, the two
parties had to collaborate in order to reach their
goal and were engaged in creating a spontaneous,
mixed-initiative dialogue without strict
guidance. Navigators and applicants were free to lead
the conversation as they preferred: we did
not use any dialogue template (although we
provided some examples), and both applicants and
navigators were allowed to ask their
interlocutor questions in order to reach the best possible
match between the applicant's needs and the job
offers available to the navigator. The only
compulsory requirement we imposed on participants was
to converse only about topics related to the
experiment. In addition, we provided as a guideline
an indicative length of 15-20 utterances overall
per dialogue.</p>
      <p>Neither navigators nor applicants were
allowed to interact with the same interlocutor twice.
Each navigator interacted with 21 different
applicants and, similarly, each applicant
interacted with 21 different navigators. With this strategy we
wanted not only to obtain dialogues as
linguistically diverse as possible, but also to ensure that
navigators with different offers interacted with
applicants with different curricula and needs.</p>
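      <p>The pairing constraint can be illustrated with a minimal Python sketch. It assumes a cyclic shift schedule, in which navigator i meets applicant (i + r) mod 25 in round r; this is a hypothetical construction used only for illustration, since the way pairs were actually drawn is not specified here. The function name <monospace>build_schedule</monospace> and its parameters are likewise invented. With 25 navigators, 25 applicants and 21 rounds, the schedule contains 25 x 21 = 525 distinct pairs, which matches the number of dialogues collected.</p>
      <preformat>
```python
# Hypothetical pairing schedule (an assumption, not the authors' procedure):
# in round r, navigator i chats with applicant (i + r) mod n_participants.
def build_schedule(n_participants=25, n_rounds=21):
    """Return a list of rounds; each round is a list of (navigator, applicant) pairs."""
    schedule = []
    for r in range(n_rounds):
        round_pairs = [(i, (i + r) % n_participants) for i in range(n_participants)]
        schedule.append(round_pairs)
    return schedule


schedule = build_schedule()
all_pairs = [pair for round_pairs in schedule for pair in round_pairs]

# No navigator-applicant pair occurs twice.
assert len(all_pairs) == len(set(all_pairs))

# Every navigator meets 21 different applicants, and vice versa.
for person in range(25):
    assert len({a for n, a in all_pairs if n == person}) == 21
    assert len({n for n, a in all_pairs if a == person}) == 21

print(len(all_pairs))  # 525 dialogues in total
```
      </preformat>
      <p>Any schedule in which each round is a one-to-one matching and no pair is repeated would satisfy the same constraint; the cyclic shift is simply the easiest one to verify.</p>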
      <p>To let navigators and applicants
interact, we used the Slack platform<sup>4</sup>, which allowed
the volunteers to chat with each other
easily while remaining anonymous through the use
of nicknames. Moreover, it allowed us to
monitor multiple conversations at the same time and
to easily download the dialogues in a JSON
format suitable for later annotation. Neither
the applicants nor the navigators knew whom
they were chatting with.</p>
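      <p>As an illustration of how such an export might be processed, the following Python sketch turns a list of exported messages into speaker-labelled utterances. It assumes that each message is a JSON object with <monospace>user</monospace>, <monospace>text</monospace> and <monospace>ts</monospace> (timestamp) fields, as in Slack's standard channel export; the sample messages, the nicknames and the function name <monospace>to_utterances</monospace> are invented for the example and are not taken from JILDA.</p>
      <preformat>
```python
import json

# Invented sample in the assumed export layout (not taken from JILDA).
raw_export = """
[
  {"user": "applicant_07", "text": "Buongiorno!", "ts": "1580000002.000200"},
  {"user": "navigator_03", "text": "Ciao, come posso aiutarti?", "ts": "1580000001.000100"}
]
"""


def to_utterances(export_text):
    """Parse the export and return (speaker, text) pairs in chronological order."""
    messages = json.loads(export_text)
    # "ts" is assumed to be a string such as "1580000001.000100";
    # it is converted to a float so that messages can be sorted by time.
    messages.sort(key=lambda m: float(m["ts"]))
    return [(m["user"], m["text"]) for m in messages]


for speaker, text in to_utterances(raw_export):
    print(f"{speaker}: {text}")
```
      </preformat>
      <p>Sorting by timestamp keeps overlapping messages in the order in which they were sent, which matters when an answer arrives one or more turns after its question.</p>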
      <p>We asked the volunteers to produce 21 chat-based
dialogues over five days, that is,
4 or 5 dialogues per day.</p>
      <p><sup>4</sup> Available at https://slack.com/intl/en-it/</p>
    </sec>
    <sec id="sec-5">
      <title>Results and Discussion</title>
      <p>At the end of the experiment, we had collected 525
chat-based, mixed-initiative dialogues<sup>5</sup>. To obtain
a first evaluation of the data produced, we
asked our volunteers to assess the quality of the
dialogues. More specifically, we asked them to
rate the degree of naturalness and the linguistic
variety of the dialogues (Table 1), and to report the difficulties
encountered in the experiment (Table 2). Of the
50 participants, 29 completed the evaluation
questionnaire. The results are reported below.</p>
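      <sec id="sec-5-3">
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Rating scale</th><th>Realism</th><th>Linguistic variety</th></tr>
            </thead>
            <tbody>
              <tr><td>1 (very low)</td><td>0%</td><td>0%</td></tr>
              <tr><td>2</td><td>7%</td><td>14%</td></tr>
              <tr><td>3</td><td>14%</td><td>55%</td></tr>
              <tr><td>4</td><td>62%</td><td>21%</td></tr>
              <tr><td>5 (very high)</td><td>17%</td><td>10%</td></tr>
            </tbody>
          </table>
        </table-wrap>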
        <p>The volunteers' evaluation is in line with what can
be observed directly in the dialogues. A
preliminary analysis shows that the dialogues
exhibit good linguistic variety and capture
complex phenomena of the Italian language, such
as co-reference. Since they are task-oriented
dialogues, the data follow a certain pattern of
questions and answers but, within this common structure,
the navigator-applicant interaction varies
considerably. For instance, we noticed
the presence of messages that are asynchronous with
respect to the context, as shown in the example
reported in Appendix A. This happens because
users tend to type fast while
chatting, which may lead to overlapping
messages, where the answer to a question is not
immediate but comes in a later turn. Furthermore,
applicants do not passively answer the navigators'
questions but often take the initiative, asking
questions and proactively giving unsolicited
information. Comparing the JILDA dialogues with the MTurk
ones, it is clear that the JILDA dialogues are more
complex and semantically diverse.</p>
        <p><sup>5</sup> Both JILDA and MTurk datasets are available here:
http://dialogo.di.unipi.it/jilda/</p>
        <p>[Table 3. Row labels: # dialogues, avg turns/dialogue, # tokens,
# sentences, # utterances, # types, # lemmas, type/token ratio,
lemma/token ratio, avg length sentences, avg length utterances,
# proactive/intent, # proactive/sentences.]</p>
        <p>
          A first analysis, for which we also used
Profiling-UD
          <xref ref-type="bibr" rid="ref4">(Brunato, 2020)</xref>
          and UDPipe
          <xref ref-type="bibr" rid="ref17">(Straka, 2017)</xref>
          ,
highlights differences between the new dataset and
the previous one<sup>6</sup>, such as:
        </p>
        <list list-type="bullet">
          <list-item>
            <p>Lexical variability. As shown in Table 3,
JILDA has greater lexical variability, which
is useful if the dataset is used to
train new models. Considering the
whole dataset, JILDA has more tokens and
types. More importantly, by selecting
subsets of JILDA with the same number of
tokens as MTurk, it is possible to verify that,
on average, JILDA's lexical richness is
higher (see the lemma/token and type/token ratios).</p>
          </list-item>
          <list-item>
            <p>Syntactic complexity. Compared with the
MTurk dataset, JILDA includes more
subordinate clauses and longer chains of
dependencies, which indicates more complex
sentences. The analysis conducted
with Profiling-UD
              <xref ref-type="bibr" rid="ref4">(Brunato, 2020)</xref>
              shows
for JILDA a higher percentage of
subordinate clauses (51.46% against 39.87% in
MTurk) and longer chains of embedded
subordinate clauses (18.35% of the chains have
length 2 or more in JILDA, against 12.48% in MTurk).</p>
          </list-item>
          <list-item>
            <p>Dialogue naturalness. The naturalness of
the JILDA dialogues partially emerged in the
first evaluation conducted with the
participants in the experiment (Tables 1 and 2). In
addition, Table 3 shows that JILDA
contains a high number of proactiveness
phenomena, which are significant in highlighting
the complexity of a dialogue and its
collaborative nature. In particular, JILDA contains
a higher number of proactive intents, both
as a percentage of the total
number of intents and of the number of
sentences<sup>7</sup>. This shows that our volunteers did
not merely answer their interlocutor by
providing the strictly required information, but
rather provided additional information on their own
initiative, which made the dialogues
more natural and complex.</p>
          </list-item>
        </list>
        <p><sup>6</sup> It is worth noting that the differences between the
two resources are primarily related to the methods used for
data collection and not to the platforms used.</p>
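        <p>The lexical comparison on equal-sized samples can be sketched as follows. This is a simplified illustration and not the pipeline used for the analysis: it relies on naive whitespace tokenisation instead of UDPipe, the two toy token lists are invented, and the function names are hypothetical. The type/token ratio is the number of distinct word forms divided by the number of tokens; because it tends to decrease as a text gets longer, the larger corpus is measured on random windows that contain as many tokens as the smaller one, and the ratios are averaged.</p>
        <preformat>
```python
import random


def type_token_ratio(tokens):
    """Number of distinct tokens (types) divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)


def mean_ttr_on_samples(tokens, sample_size, n_samples=100, seed=0):
    """Average TTR over random contiguous windows of `sample_size` tokens."""
    if sample_size > len(tokens):
        raise ValueError("sample_size exceeds the corpus length")
    rng = random.Random(seed)
    last_start = len(tokens) - sample_size
    ratios = []
    for _ in range(n_samples):
        start = rng.randint(0, last_start)  # both bounds are inclusive
        ratios.append(type_token_ratio(tokens[start:start + sample_size]))
    return sum(ratios) / len(ratios)


# Invented toy corpora split on whitespace (no real data).
small_corpus = "cerco un lavoro come traduttore cerco un lavoro".split()
large_corpus = ("ho lavorato due anni come guida museale e "
                "vorrei rimanere nel campo dei musei o dei luoghi turistici").split()

# Measure the larger corpus on windows as long as the smaller corpus.
k = len(small_corpus)
print(type_token_ratio(small_corpus))
print(mean_ttr_on_samples(large_corpus, k))
```
        </preformat>
        <p>The same procedure applies to the lemma/token ratio once the tokens are replaced by their lemmas.</p>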
        <p>
          The annotation of the dialogues is now in progress,
in order to offer the scientific community not
only a new set of dialogues for Italian
but also, and above all, a richly annotated
dataset. The annotation will be based on the
MultiWOZ annotation scheme, which is becoming a
standard for dialogue datasets (Budzianowski, 2018).
However, whereas in MultiWOZ only the user's turns
are annotated, we decided to annotate both the
applicant's and the navigator's utterances, since we noticed
that both convey important and useful
information. The preliminary analysis of the data
presented here will be extended once the
annotation is complete. To support the annotation
of the JILDA dataset, we modified an open-source
dialogue annotation tool, LIDA, in collaboration
with its developers
          <xref ref-type="bibr" rid="ref7">(Collins, 2019)</xref>
          . Specifically,
we extended this tool to 1) support
multiple annotators working on the same project, 2)
manage multiple annotation styles and metadata,
3) manage different collections of
dialogues and 4) simplify the annotation interface,
improving the user experience. Both the new
release of the LIDA multi-user annotation tool and
the annotated JILDA dataset will be made
available to the scientific community.
        </p>
        <p><sup>7</sup> Proactive intents were explicitly annotated for this count.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper we presented JILDA, a novel dataset
of chat-based, mixed-initiative dialogues built for
Italian and related to the job-offer
domain. This new resource was built by adopting
an experimental approach based on the Map Task
experiment, which allowed us to collect
mixed-initiative data that effectively represent the
naturalness typical of human-human
interaction. The JILDA dataset, which includes 525
dialogues, is being
annotated with dialogue acts and entities related
to this specific domain. For the annotation of
these dialogues we are using our own extension of
LIDA. The annotated dialogues will then be used
to train a conversational agent. With this new
resource, our goal is to enable an agent to chat with
the user in a natural and human-like way.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work has been endorsed by AILC (Italian
Association for Computational Linguistics).
We thank Carla Congiu, Clara Casandra and
Davide Cucurnia, students of Digital Humanities at
the University of Pisa, for their annotation work on
JILDA and for contributing to the development of
the annotation tool.</p>
    </sec>
    <sec id="sec-7-app-a">
      <title>Appendix A</title>
      <sec id="sec-7-app-a-1">
        <title>Example of asynchronous message in JILDA</title>
        <p>Navigator: Cercano persone che si occupino sia di
gestire la comunicazione pubblicitaria del cliente
attraverso il web, che di interagire direttamente con la
clientela.</p>
        <p>Applicant: Quanto tempo dura il periodo di
formazione?</p>
        <p>Navigator: Questo significa che abilità di
comunicazione sono essenziali in questo lavoro</p>
        <p>Applicant:</p>
        <p>Navigator: L’annuncio non fornisce informazioni
circa la durata del contratto, mi dispiace</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Appendix B</title>
      <sec id="sec-8-1">
        <title>Example of dialogue from MTurk</title>
        <p>sys: Salve e benvenuto alla Recruiter Top, io sono
Tony.</p>
        <p>usr: Buongiorno Tony, mi chiamo Giorgio e sono alla
ricerca di un lavoro come traduttore.</p>
        <p>sys: Bene, mi dica qualcosa in più su di lei;
attualmente lavora o studia? e quali sono le sue competenze?</p>
        <p>usr: Mi sono appena laureato in lingue e letterature
straniere, nello specifico con conoscenza di inglese,
spagnolo e francese</p>
        <p>sys: E per quanto riguarda esperienze lavorative?</p>
        <p>usr: Ho lavorato 2 anni in una casa editrice</p>
        <p>sys: in che senso ha lavorato per 2 anni in una casa
editrice?</p>
        <p>usr: Sono stato viceredattore per una casa editrice
locale.</p>
        <p>sys: Ok, e per quello che riguarda le lingue straniere?
essendo laureato in lingue...</p>
        <p>usr: bhe si..ovviamente ho una buona conoscenza di
francese , inglese e spagnolo</p>
        <p>sys: Che contratto e tipologia di azienda sta cercando
o valuterebbe?</p>
        <p>usr: Propenderei per un tempo determinato in una
azienda all’estero</p>
        <p>sys: al momento si cerca per la sede di Gais figura
professionale che si occupi della corrispondenza
telefonica e scritta con i nostri clienti spagnoli e portoghesi e
di organizzare eventi di marketing, potrebbe essere
interessato?</p>
        <p>usr: Si, mi dia i dettagli e lo valuterò. Grazie e
arrivederci</p>
        <p>sys: Arrivederci e buona fortuna.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Example of dialogue from JILDA</title>
        <p>sys: Ciao, sono il tuo Navigator di oggi, mi chiamo
Mattia. Posso aiutarti in qualche modo?</p>
        <p>usr: Buongiorno Mattia, mi chiamo Valentina e sto
cercando un lavoro a tempo determinato.</p>
        <p>sys: Ciao Valentina, puoi dirmi qualcosa in più sugli
studi che hai fatto?</p>
        <p>usr: Certamente! Mi sono laureata tre anni fa in
Lingue e Letterature straniere.</p>
        <p>sys: Ottimo, hai già avuto esperienza lavorativa in
passato o sarebbe il tuo primo lavoro?</p>
        <p>usr: Ho già avuto un’esperienza lavorativa, perché per
due anni ho lavorato come guida museale.</p>
        <p>sys: Ti è mai capitato di lavorare a progetti con
bambini, durante questi due anni?</p>
        <p>usr: Quando lavoravo per il museo non ho mai
affrontato dei progetti specifici riguardanti i bambini. Ho
però fatto da guida a delle scolaresche.</p>
        <p>sys: Ho qui un annuncio riguardo la possibilità di
fare assistenza scolastica a minori con disabilità, dalle
scuole d’infanzia alle superiori. Pensi che ti
piacerebbe provare qualcosa del genere?</p>
        <p>usr: Sarebbe un’esperienza interessante, ma non credo
di avere le competenze necessarie. Preferirei rimanere
nel campo dei musei o, in generale, in quello dei luoghi
turistici.</p>
        <p>sys: Al momento non ho annunci per posti
disponibili in campo turistico o museale, mi dispiace. Data
la tua laurea in Lingue però, vorrei proporti un
annuncio di CHANEL Cordination S.r.l., sono alla ricerca di
una stagista da affinacare alla Responsabile Qualità
Prodotto referente per l’Italia.</p>
        <p>usr:</p>
        <p>sys: dovresti occuparti principalmente di
Monitoraggio del database dei prodotti delle collezioni.
Gestione dei contatti con i fornitori locali ed
esteri.Archiviazione e consultazione dei Test di
laboratorio e supporto della responsabile nella preparazione
di presentazioni in PPT e nelle traduzioni della
reportistica nelle lingue in inglese e francese</p>
        <p>usr: Mi interesserebbe molto. Dove si trova l’azienda?</p>
        <p>sys: La sede dell’azienda è a Milano, quindi
probabilmente dovrai spostarti lì se non abiti già in zona,</p>
        <p>usr: Non sarebbe un problema spostarmi. Il lavoro é a
tempo pieno o a tempo parziale?</p>
        <p>sys: Non è specificato nell’annuncio, so solo che si
tratta di un tirocinio/stage. Probabilmente è un cosa
da discutere in fase di colloquio direttamente con loro</p>
        <p>usr: Ok grazie.</p>
        <p>sys: Puoi contattare direttamente l’azienda a questo
indirizzo e-mail info@azienda.com</p>
        <p>usr: Perfetto, grazie mille! :)</p>
        <p>sys: Figurati, buona fortuna per il lavoro!</p>
        <p>usr: Grazie, buona giornata! :)</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bader</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bard</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doherty</surname>
            ,
            <given-names>G. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garrod</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isard</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kowtko</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McAllister</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sotillo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thompson</surname>
            ,
            <given-names>H. S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Weinert</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <year>1991</year>
          .
          <article-title>The HCRC Map Task Corpus</article-title>
          .
          <source>Language and Speech</source>
          ,
          <volume>34</volume>
          , pp.
          <fpage>351</fpage>
          -
          <lpage>366</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Bentivogli</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magnini</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <year>2014</year>
          .
          <article-title>An Italian Dataset of Textual Entailment Graphs for Text Exploration of Customer Interactions</article-title>
          .
          <source>In Proceedings of the first Italian Computational Linguistics Conference.</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yule</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shillcock</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <year>1984</year>
          .
          <article-title>Teaching talk: Strategies for production and assessment</article-title>
          . Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Brunato</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimino</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dell'Orletta</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montemagni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venturi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <year>2020</year>
          .
          <article-title>Profiling-UD: a Tool for Linguistic Profiling of Texts</article-title>
          .
          <source>In Proceedings of 12th Edition of International Conference on Language Resources and Evaluation (LREC 2020)</source>
          , 11-16 May 2020, Marseille, France.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Budzianowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>T.-H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tseng</surname>
            ,
            <given-names>B.-H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casanueva</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ultes</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramadan</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gašić</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <year>2018</year>
          .
          <article-title>MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling</article-title>
          .
          <source>In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , pp.
          <fpage>5016</fpage>
          -
          <lpage>5026</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Castellucci</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bellomaria</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Favalli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Romagnoli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <year>2019</year>
          .
          <article-title>Multi-lingual Intent Detection and Slot Filling in a Joint BERT-based Model</article-title>
          . In arXiv abs/1907.02884.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Collins</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rozanov</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <year>2019</year>
          .
          <article-title>LIDA: Lightweight Interactive Dialogue Annotator</article-title>
          .
          <source>In Proceedings of the 2019 EMNLP and the 9th IJCNLP (System Demonstrations)</source>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>126</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Dell'Orletta</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montemagni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venturi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <year>2011</year>
          .
          <article-title>READ-IT: Assessing readability of Italian texts with a view to text simplification</article-title>
          .
          <source>In Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies (SLPAT 2011)</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>El Asri</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zumer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fine</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehrotra</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suleman</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <year>2017</year>
          .
          <article-title>Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems</article-title>
          . In arXiv:1704.00057.
        </mixed-citation>
      </ref>
      <ref>
        <mixed-citation>
          <string-name>
            <surname>Kelley</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <year>1984</year>
          .
          <article-title>An iterative design methodology for user-friendly natural language office information applications</article-title>
          .
          <source>In ACM Transactions on Information Systems (TOIS)</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>26</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kahou</surname>
            ,
            <given-names>S.E.</given-names>
          </string-name>
          ,
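          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michalski</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,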
          <string-name>
            <surname>Charlin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pal</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Towards Deep Conversational Recommendations</article-title>
          .
          <source>In Advances in Neural Information Processing Systems 31 (NIPS 2018)</source>
          , pp.
          <fpage>9748</fpage>
          -
          <lpage>9758</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pow</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serban</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pineau</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <source>In Proceedings of the SIGDIAL 2015 Conference</source>
          , pp.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Mana</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cattoni</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pianta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rossi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pianesi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Burger</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2004</year>
          .
          <article-title>The Italian NESPOLE! Corpus: a Multilingual Database with Interlingua Annotation in Tourism and Medical Domains</article-title>
          .
          <source>In Proceedings of 4th International Conference LREC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Rieser</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lemon</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <year>2008</year>
          .
          <article-title>Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation</article-title>
          .
          <source>In Proceedings of ACL-08: HLT</source>
          , pp.
          <fpage>638</fpage>
          -
          <lpage>646</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hakkani-Tür</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tür</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <article-title>Bootstrapping a Neural Conversational Agent with Dialogue Self-Play, Crowdsourcing and On-Line Reinforcement Learning</article-title>
          .
          <source>In Proceedings of NAACL-HLT 2018</source>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Straka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Straková</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <year>2017</year>
          .
          <article-title>Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe</article-title>
          .
          <source>In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies</source>
          , pp.
          <fpage>88</fpage>
          -
          <lpage>99</lpage>
          . Vancouver, Canada, August 2017
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>T.-H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vandyke</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mrkšić</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gašić</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rojas-Barahona</surname>
            ,
            <given-names>L.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>P.-H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ultes</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Young</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <article-title>A network-based end-to-end trainable task-oriented dialogue system</article-title>
          .
          <source>In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics</source>
          , vol.
          <volume>1</volume>
          , pp.
          <fpage>438</fpage>
          -
          <lpage>449</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papangelis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudnicky</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          ,
          <year>2015</year>
          .
          <article-title>TickTock: A Non-Goal-Oriented Multimodal Dialog System with Engagement Awareness</article-title>
          . AAAI Spring Symposia.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>