<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting domain-specific information needs in conversational search dialogues</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexander Frummet</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Elsweiler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernd Ludwig</string-name>
          <email>bernd.ludwigg@ur.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Regensburg</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>As conversational search becomes more pervasive, it becomes increasingly important to understand the user's underlying needs when they converse with such systems in diverse contexts. We report on an in-situ experiment to collect conversationally described information needs in a home cooking scenario. A human experimenter acted as the perfect conversational search system. Based on the transcription of the utterances, we present a preliminary coding scheme comprising 27 categories to annotate the information needs of users. Moreover, we use these annotations to perform prediction experiments based on random forest classification to establish the feasibility of predicting the information need from the raw utterances. We find that a reasonable accuracy in predicting information need categories is possible and evidence the importance of stopwords in the classification task.</p>
      </abstract>
      <kwd-group>
        <kwd>Conversational Search</kwd>
        <kwd>Information Needs</kwd>
        <kwd>Prediction</kwd>
        <kwd>Cooking</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Voice-based interaction systems are changing the way people seek
information, making search more conversational [
        <xref ref-type="bibr" rid="ref14 ref35">14, 35</xref>
        ]. Spoken queries are very
different to typed queries [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and by mining spoken interaction data, intelligent
assistance can be provided [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Voice-based digital assistants such as Amazon
Echo and Google Home show that information-seeking conversations now take
place in diverse situations embedded in users' everyday lives. They draw on
research from the fields of both information retrieval and NLP. One
crucial feature of this kind of assistant is the ability to understand and infer user
needs. With conversational search tipped to dominate search in the future [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], it
is crucial to understand how conversations vary in these diverse domains.
      </p>
      <p>
        Many challenges remain for both the interactive information retrieval and
the NLP community to allow systems to be developed to support the complex
tasks suited to this mode of interaction [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. A recent SWIRL workshop
breakout group identified key challenges for conversational search, including the need
to accurately elicit information needs, correct user misconceptions and provide
the right amount of information at the right time across all possible domains [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Our focus is on the first of these challenges – need elicitation – specifically on
understanding and predicting user information needs, which are important for
systems to conversationally identify what a user requires, facilitate appropriate
retrieval and attain relevance feedback [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. We study information needs in the
domain of home cooking, which, based on the literature, we believed would be
a fertile context for the kinds of complex needs suited to conversational search
[
        <xref ref-type="bibr" rid="ref10 ref12">12, 10</xref>
        ] and a situation where users simultaneously perform practical, sometimes
cognitively challenging tasks that make searching in the traditional sense
problematic.
      </p>
      <p>Concretely, our contributions are the following:
– we perform an in-situ study that facilitates a naturalistic cooking situation
resulting in the organic development of information needs,
– we analyse the collected data qualitatively to learn about the diverse types
of information needs which can occur in this context,
– we utilise machine learning approaches to classify needs using the raw
transcription of participant utterances.</p>
      <p>In doing so, our findings add to the conversational agents literature, where intent
recognition is crucial for determining and planning the next steps of an agent in a
dialogue. Moreover, our initial results are insightful for the future development of
conversational search systems, as they show that within this context it is possible
to detect the kind of need a user has based on the raw speech utterances. Note,
however, that we are reporting preliminary findings and plan to extend our
analyses in the future.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Our work relates to research contributions across diverse fields of the computer
and information sciences, especially at the intersection of natural language
understanding and artificial intelligence. Here we link the fields by highlighting
contributions on conversation, conversational agents and understanding and predicting
user needs and goals.</p>
      <sec id="sec-2-1">
        <title>Conversational Agents</title>
        <p>
          Being able to detect and process user intents is a crucial and challenging part
of the development of conversational agents. Typically, natural language
understanding is performed using a dialogue manager that processes user input
in such a way that the agent understands what to do next [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. One important
aspect for understanding user intent is maintaining the context. For this purpose,
user models are generated (e.g. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ], [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]) but also linguistic concepts such as
dialogue acts (e.g. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]), meaning relations [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] and sentiment analysis [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]
are relevant facets that need to be considered for intent recognition.
        </p>
        <p>
          With the growing popularity of conversational search systems, which are
defined as systems “retrieving information that permits a mixed-initiative back
and forth between a user and agent, where the agent's actions are chosen in
response to a model of current user needs within the current conversation, using
both short- and long-term knowledge of the user” [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], the aforementioned
concepts become relevant for user needs elicitation. This definition highlights the
importance of memory, where the system can recall past interactions and
reference these explicitly during conversations, as is done with dialogue managers in
conversational agents. Such systems have emerged not only due to hardware
developments, but because traditional search systems are unsuited to the complex
tasks people perform [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Predicting information needs conversationally</title>
        <p>
          Understanding and algorithmically predicting user needs can be useful for many
reasons: different results can be shown [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], results can be presented differently
[
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] or answers can be presented directly in the results page [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Conversations
with the user are one means of detecting such information needs. Automated
conversational agents can provide personalization and support users in selecting
an item [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] or talking about areas of interest [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] and have been applied in
scenarios such as trauma therapy [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. This often requires systems to exhibit
the memory property referred to earlier in order to maintain an understanding
of context [
          <xref ref-type="bibr" rid="ref17 ref2">17, 2</xref>
          ].
        </p>
        <p>
          In conversational search, preliminary work has utilised user speech utterances
as a means to identify information needs. Shiga et al. [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] classify needs along
two dimensions, the first of which uses Taylor's levels of specification [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] and the
second of which delineates type based on a classification derived from the
literature. They moreover incorporate an aspect of task hierarchy, where a main task
(e.g. booking a holiday) can be viewed as consisting of sub-tasks (e.g. finding a
destination, comparing flight schedules etc.). Their work shows that information
need categories can be distinguished using machine learning approaches. This
work represents an excellent contribution and is the closest to our own research
in terms of motivation and approach. However, the categories of needs predicted
are very high-level and not domain-specific. One could imagine that
conversations and the types of support required across domains could be quite different.
If systems could identify specific need types within specific domains,
conversational systems could provide much more appropriate assistance. Thus, building
on Shiga et al.'s work, we test similar approaches in a home cooking context.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Methods</title>
      <sec id="sec-3-1">
        <title>Data Collection</title>
        <p>
          To establish a corpus of naturalistic conversational data large enough to perform
machine learning prediction, we devised an in-situ user study. We simulated a
natural cooking situation by gifting a box of ingredients to participants in their
own kitchen at meal time. Participants were tasked with cooking a meal which
they had not cooked before based on as many of the contained ingredients as
possible, although these could be supplemented with the contents of their own
pantry. To assist the process they could converse with the experimenter who
would answer questions and needs using any resource available to him via the
Web. The experimenter provided the best answer he could and communicated
this orally in a natural human fashion (arguably the optimal behaviour for a
conversational system). No time constraints were imposed for the task. Concretely,
for each participant, the procedure comprised six steps:
1. The instructions were read to the participant.
2. Participants signed a consent form explaining how the collected data would
be stored and used in the future.
3. The ingredient box was provided.
4. The recording device was tested.
5. Participants started the cooking task and the full dialogue between
experimenter and participant was recorded.
6. After the task, the experimenter thanked the participant and gifted the
remaining ingredients.
To ensure divergent recipes and conversations the ingredient boxes varied across
participants. The ingredients typically had a value of around €10 and were
chosen based on guidelines by the German Nutritional Society [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], which suggest
7 categories of ingredient are required for a balanced meal. Typically the box
contained some kind of grain or starch (e.g. potatoes or rice), a selection of
vegetables and a source of protein (e.g. eggs). Participants prepared diverse
meals using the ingredients, a selection of which can be found in Figure 1.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Participants</title>
        <p>
          Participants were recruited using a snowball sampling technique with a
convenience sample providing the first group of candidates. These participants, in
turn, were willing to recruit friends and relatives and so on. This method offers
two advantages. First, it generates a basis for trust among the participants and
the experimenter which [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] claim leads to more informal and open speech. Our
impressions confirmed relaxed and natural behaviour in the experiments.
Second, it allowed a relatively large sample to be achieved. The only requirements
for potential participants were a kitchen and an Internet connection. Participants
were not paid for participation but, to increase the response rate, ingredients were
gifted.
        </p>
        <p>45 participants (22 female, mean age = 24 years, minimum age = 19 years, maximum age = 71
years, 20% non-students) were tested between May 7, 2018 and June 28, 2018.
37 had never used conversational agents before, while four used either Alexa or
Google Home. Asked about their cooking experience, six participants reported
cooking multiple times per week or on a daily basis, 18 said they cook seldom
or not at all and one person regarded cooking as her hobby. The remaining 20
participants stated that they cooked but not on a regular basis.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Transcription and Identification of Needs</title>
        <p>
          In total, 38.75 hours of material were collected with the language spoken
being German. The recorded conversations were transcribed and annotated by
a trained linguist, who was also the experimenter, using the recommendations
by Dresing and Pehl [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. This involved translating any dialectal expressions
into standard German – a step necessary to employ word embeddings (see
section 4.2). The syntax of the utterances remained unchanged by this process.
Thereafter, the utterances were split into queries. Table 1 provides examples of
different kinds of utterances treated as a query in our analyses. In general, one or
several questions in a row were counted as one query as long as the experimenter
was of the opinion that the utterances represented the same information need.
Otherwise, the utterances were split and counted as separate information needs.
As can be seen in the examples in Table 1, a direct question has the form of an
interrogative clause. Indirect questions, however, do not exhibit this
grammatical form but can clearly be interpreted as a query or question to the system.
Implicit/explicit actions do not have the grammatical shape of a question at all,
but can be interpreted by a human as such. This is strongly connected to the
surrounding context and, as such, the identification was performed by a human,
in this case a trained linguist. The example in Table 1 illustrates that despite
the fact that the utterance does not exhibit the form of an interrogative clause,
the user implicitly requires an answer to this utterance. The follow-up type
consists of a query which results in another query after the system has answered
the first. The two queries illustrated in Table 1 were counted as two separate
queries in the corpus. Based on these rules of counting, trials yielded on average
36.93 queries (min = 7, first quartile = 22, median = 36, third quartile = 50,
max = 73, sd = 17.48, skewness = 0.26, kurtosis = 2.19). The overall number of
queries extracted was N<sub>q</sub> = 1662.
        </p>
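        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Type</th><th>Example</th></tr>
            </thead>
            <tbody>
              <tr><td>Direct question</td><td>“What is the cooking time of asparagus?” (part. 42)</td></tr>
              <tr><td>Indirect question</td><td>“Er – Alex, tell me how I need to cook red lentils.” (part. 29)</td></tr>
              <tr><td>Implicit/explicit action</td><td>“Ok, then this is similar to couscous” (part. 34)</td></tr>
              <tr><td>Follow up</td><td>“So, at first the water and then? – System: Put in the asparagus. – Oh, right from the start? – System: No.” (part. 3)</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>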
          We analysed the collected data both qualitatively and quantitatively. First, using
methods akin to content analysis, we examine the information needs identified
to establish the variation of needs that occurred. This results in a classification
scheme and a set of information needs annotated with an appropriate category.
We continue to report on quantitative experiments, which establish the
feasibility of automatically categorising the queries (information needs) using machine
learning with the raw utterance text.
        </p>
        <p>
          As with the previous processing of the transcribed utterances, the qualitative
analysis was performed by a trained linguist familiar with dealing with such
data. The starting point for the coding scheme was the set of categories derived
for cooking-related questions posted on the Google Answers forum in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. From
the examples provided by Cunningham and Bainbridge we derived category
definitions. Then, in a process akin to the coding process by Strauss and Corbin
[
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], each query was taken in turn, and we attempted to apply a category
from the existing scheme. When none of the existing categories was suitable, a
new category was derived and a corresponding definition was created. Whenever
a new category was established, all existing definitions were carefully reassessed
to avoid potential overlap. On occasion an utterance included more than one
information need, in which case more than one category was assigned.
This is reasonable given the fact that conversational search systems are generally
expected to be able to understand pragmatics [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. The process was iterative and
tested repeatedly until the researcher was satisfied that a consistent classification was
achieved. The outcome of this classification was 27 different information need
categories.
        </p>
        <p>These were used to label all queries. The frequency distribution (see
Figure 2) of queries per category is heavily right-skewed (mean = 61.56, min = 1,
first quartile = 3.5, median = 13, third quartile = 60.5, max = 506, sd = 111.77,
skewness = 2.79, kurtosis = 10.79). The 10 most frequently assigned categories account for 93%
of all utterances. For space reasons, we limit our descriptions to five categories<sup>1</sup>.
The prediction experiments reported are only concerned with the top 10
categories. Next, the categories are explained in descending order of relative
frequency (given next to the label). To assist the reader's understanding, we
additionally provide a query example for each category. Quotes from transcripts are
translated from German to English.</p>
        <p>Procedure – 30.45%: Utterances were labelled with this category when queries
related to a particular step of the recipe (as opposed to general cooking
techniques, see below). An example is “What's next after bringing to the
boil?” (part. 42)</p>
        <p><sup>1</sup> We plan to publish the full coding scheme and descriptions, as well as the anonymised
transcriptions, both in German and English, as an open dataset to the community.</p>
        <p>Amount – 17.45%: The label Amount was used to code queries from
participants who wanted to know about the quantity of an ingredient needed, e.g.
“How much egg yolk is needed?” (part. 2)</p>
        <p>Ingredient – 11.07%: Whenever questions occurred regarding which ingredients were
necessary for a particular recipe, these utterances were tagged with the
label Ingredient. A typical example was “Which ingredients are needed?”
(part. 1).</p>
        <p>Cooking Technique – 8.24%: Utterances/queries were labelled this way when
participants requested information about preparing ingredients that was
not made explicit in the steps of the recipe. For example, a recipe would
state “cook the asparagus”. Participants not knowing how to cook
asparagus would then ask “How does one actually cook asparagus? Can you look
this up, please?” (part. 2).</p>
        <p>Name of Dish – 6.20%: This label was used in cases where participants searched for recipes
they would like to prepare as their main dish. They often used ingredients
as search terms in such cases, e.g. “Then I'd like to have a dish with lentils,
chickpeas and tomatoes” (part. 10)</p>
      </sec>
      <sec id="sec-3-4">
        <title>Predicting Information Needs</title>
        <p>
          The quantitative analysis was formulated as a prediction task, i.e. given a set
of features derived from the raw conversational utterances and context
information, is it possible to predict the category of information need? We employed a
random forest classifier for this purpose because it turned out to be an effective
approach in Shiga et al.'s work [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. All experiments reported below use
the Python package scikit-learn [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] and are based on 10-fold cross-validation.
Table 2 presents the results of the experiments, reporting average accuracy including
95% CIs based on 100 replications. The variance in accuracy converged after
30 replications.
        </p>
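        <p>One way to summarise replicated runs as an average with a 95% interval can be sketched as follows. This is a minimal illustration and not the authors' evaluation script: the helper name summarize_replications and the normal-approximation interval (mean plus or minus 1.96 standard errors) are assumptions, since the text does not state how the intervals were constructed, and the input values are placeholders rather than results from the study.</p>
        <preformat>
```python
import math
import statistics


def summarize_replications(accuracies, z=1.96):
    """Return (mean, lower, upper) for a list of per-replication accuracies.

    Assumption: a normal-approximation interval around the mean,
    mean +/- z * sd / sqrt(n). The interval construction used in the
    study is not stated in the text.
    """
    n = len(accuracies)
    mean = statistics.mean(accuracies)
    standard_error = statistics.stdev(accuracies) / math.sqrt(n)
    return mean, mean - z * standard_error, mean + z * standard_error


# Placeholder accuracies, one per replication (illustrative only).
example_accuracies = [0.38, 0.41, 0.40, 0.43, 0.39]
mean, lower, upper = summarize_replications(example_accuracies)
print(f"avg. accuracy = {mean:.4f} [{lower:.4f}, {upper:.4f}]")
```
        </preformat>
        <p>In such a setup, each list entry could hold the accuracy of one complete 10-fold cross-validation run.</p>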
        <p>
          As the use of word embeddings was shown to be beneficial for predicting
information need categories in [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], we used these word embeddings as a
baseline feature for all classification experiments. To this end, we employed
200-dimensional word embeddings trained on 2 million German-language Wikipedia
articles [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] for the classification task. Using these 200-dimensional word
embeddings as features yielded an average accuracy of 0.4024 [0.3355, 0.4697].
        </p>
        <p>Several additional features were developed to improve this baseline
performance. First, to incorporate the idea of memory into the system, the previous
information need was operationalized as a predictive feature.</p>
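        <p>The memory feature described above could be derived along the following lines. The function names add_previous_need and one_hot, the start token NONE for the first query of a session, and the one-hot encoding are assumptions made for illustration; the text does not specify how the previous need was encoded.</p>
        <preformat>
```python
def add_previous_need(session_labels, start_token="NONE"):
    """Pair every need label with the label that preceded it in the session.

    The first query of a session has no predecessor, so it receives the
    hypothetical start_token instead.
    """
    previous = [start_token] + list(session_labels[:-1])
    return list(zip(previous, session_labels))


def one_hot(label, categories):
    """Encode a label as a 0/1 vector over a fixed list of categories."""
    return [1 if label == category else 0 for category in categories]


# Category names are taken from the coding scheme; the session is made up.
categories = ["NONE", "Procedure", "Amount", "Ingredient"]
session = ["Ingredient", "Amount", "Procedure", "Procedure"]
for previous, current in add_previous_need(session):
    print(current, one_hot(previous, categories))
```
        </preformat>
        <p>The resulting vector would be concatenated with the other features of the current query.</p>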
        <p>Second, the normalized sequence ID was added as a context feature. The
sequence ID represents the position of an information need in a cooking session,
with the information need with ID 1 being the first to occur, ID 5 being the fifth
and so on. Normalizing the sequence ID was necessary because some sessions
were considerably longer than others (see section 3.4).</p>
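        <p>A simple way to normalise the sequence ID is to divide each position by the length of its session. This particular formula is an assumption, as the text does not give one, and the function name normalized_sequence_ids is hypothetical.</p>
        <preformat>
```python
def normalized_sequence_ids(n_queries):
    """Map the positions 1..n_queries of a session onto the interval (0, 1]."""
    return [position / n_queries for position in range(1, n_queries + 1)]


# The shortest and longest sessions in the corpus had 7 and 73 queries.
short_session = normalized_sequence_ids(7)
long_session = normalized_sequence_ids(73)

# The fifth query sits late in the short session but early in the long one.
print(round(short_session[4], 3), round(long_session[4], 3))
```
        </preformat>
        <p>With this scaling, the same raw ID yields different feature values depending on how far the session has progressed.</p>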
        <p>
          Next, we examined the vocabulary used more closely. We employed stopword
removal using the German-language stopword list available as part of the
nltk Python package [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. As a result, the accuracy significantly decreased to an
average of 0.3289 [0.2548, 0.4065], indicating that stopwords are in fact meaningful
and relevant in the cooking task context. In a second run, only the top 50
words were used because these represent 50% of all words in the corpus and
the collection frequency strongly decreases after the 50th word. This results
in a small increase in average accuracy to 0.4480 [0.3797, 0.5186] compared to the
baseline.
[Table 2. Average accuracy and 95% CI per feature combination: Word Embeddings; Word Embeddings + Previous Needs; Word Embeddings + Previous Needs + Normalized Sequence IDs.]
        </p>
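        <p>The two vocabulary manipulations can be sketched in plain Python. The study used the German stopword list shipped with nltk; the tiny stopword set below is a hand-picked stand-in so that the sketch runs without external packages, and all function names are hypothetical. Reading "top 50 words" as restricting each utterance to the 50 most frequent corpus words is likewise an interpretation.</p>
        <preformat>
```python
from collections import Counter


def remove_stopwords(tokens, stopwords):
    """Drop every token whose lowercased form is in the stopword set."""
    return [token for token in tokens if token.lower() not in stopwords]


def top_k_vocabulary(tokenized_utterances, k=50):
    """Return the k most frequent lowercased words across all utterances."""
    counts = Counter(
        token.lower() for utterance in tokenized_utterances for token in utterance
    )
    return {word for word, _ in counts.most_common(k)}


def keep_only(tokens, vocabulary):
    """Keep only the tokens whose lowercased form is in the vocabulary."""
    return [token for token in tokens if token.lower() in vocabulary]


# Made-up German utterances and a hand-picked stand-in stopword set.
utterances = [
    ["wie", "lange", "muss", "der", "spargel", "kochen"],
    ["und", "was", "kommt", "dann"],
]
stand_in_stopwords = {"wie", "der", "und", "was", "dann"}

print(remove_stopwords(utterances[0], stand_in_stopwords))
vocabulary = top_k_vocabulary(utterances, k=3)
print(keep_only(utterances[1], vocabulary))
```
        </preformat>
        <p>Either filter would be applied to the tokens before they are mapped to word embeddings.</p>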
        <p>
          In a final round of experimentation, we analyzed the impact of resampling on
the prediction accuracy. As described above (see section 4.1), the distribution of
classes was heavily skewed. We performed oversampling using SMOTE [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and
NearMiss-2 [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ] as the undersampling approach. We employed imbalanced-learn [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]
as the resampling library. Resampling led to a significant increase in average accuracy
(0.6391 [0.5313, 0.7467]).
        </p>
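        <p>The interpolation idea behind SMOTE can be illustrated with a few lines of plain Python. This is a simplified sketch and not the imbalanced-learn implementation used in the study; the function name smote_like_oversample and its parameters are hypothetical.</p>
        <preformat>
```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points for a minority class.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if not minority or len(minority) == 1:
        raise ValueError("at least two minority samples are required")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [point for point in minority if point is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Three made-up feature vectors standing in for a rare need category.
rare_class = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
for point in smote_like_oversample(rare_class, n_new=4):
    print([round(value, 3) for value in point])
```
        </preformat>
        <p>Undersampling with NearMiss-2 works in the opposite direction by discarding majority-class samples and is not shown here.</p>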
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>In this work we performed an in-situ cooking study where participants were
given a box of ingredients and charged with cooking a meal of their choice. The
study provided a corpus of conversations in which diverse information needs
were communicated in natural language to an experimenter simulating a personal
assistant. This corpus provided the basis for us to study the information
needs which can occur in this context and to run prediction experiments to
determine if the type of need could be automatically predicted. We discuss the
results of the analysis along four lines: peculiarities
of conversations in the cooking domain, the importance of memory in
predicting information needs, aspects of conversational style and eliciting information
needs in the domain of cooking.</p>
      <p>
        What is special about conversations in this domain? We identified 27 fine-grained
information need categories, ten of which were sufficient to label 93% of all
queries. The information need taxonomy presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] was used as a starting
point for the qualitative analysis. Comparing our results to those in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] yields
differences in terms of occurrence and distribution of information needs. Only
      </p>
      <p>
        Ingredient and Name of dish are frequent in both studies. While Cunningham
and Bainbridge report Name of dish/item, Ingredient, Type of dish, course, meal
and Ethnic/National/Region being most common, Procedure is the most
frequently used label in the data we collected. Indeed, Type of dish, course, meal
and Ethnic/National/Region are rarely applied in our corpus – and vice versa
for Cooking Technique. The differences can largely be explained by the fact that
in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] questions were not posed while actually cooking. Thus, categories like
Amount, Time, Time Report and Knowledge did not occur in their study. These
information needs tend to relate to actual cooking tasks rather than serve as
descriptors for text-based search for recipes. In terms of information need
categories we find two commonalities with [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] – despite the difference in domains
between their study and ours. First, some of the information need categories in
the cooking task scenario show the hierarchical relationship of main tasks and
sub-tasks. The categories Procedure and Procedure Moment are good examples: while
the first refers to all steps needed to provide a meal, the latter is concerned with
a specific step in the cooking process. Second, the different levels of
task (with the exception of “search”) are mirrored in our data. We find queries
relating to topical knowledge about the cooking task (see category Procedure)
as well as those relating to problem solving (see category Cooking Technique)
and situation (see category Time Report).
      </p>
      <p>
        Link between natural dialogue and memory. The fact that adding the previous
information need as a feature did not increase the accuracy values achieved is
a surprising result. This is in contrast to the importance of memory which can
be derived from theoretical work (see e.g. [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]) and also some task conditions
in [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. One reason for this result may be a lack of data.
Future work will, consequently, be dedicated to gathering more data and
reassessing the effect memory has. A second possible explanation might
be the existence of user subgroups. Users who cook on a regular basis may have
different sequences of needs than those who prepare meals less frequently.
      </p>
      <p>
        The use of conversational style. By running experiments with and without
stopword removal we provide empirical evidence that the most heavily used words
are most important to elicit information needs. This is in line with findings in the
domain of very short text retrieval (e.g. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]) when stopwords are removed. In
the context of cooking tasks many stopwords may be discourse cues, which have
important functions for text comprehension, including easing the reconstruction
of the line of argumentation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], signaling misunderstanding [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] and facilitating
recall in information processing [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. It makes sense that prediction performance decreases when the stopwords are
removed, as removing the cues destroys the observable line of argumentation in the
discourse.
      </p>
      <p>
        One compelling area of future research would, thus, be to compile
corpus-specific stopword lists (e.g. for different domains), as has been suggested, for example, in the
domain of sentiment analysis [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
      </p>
      <p>
        The need to understand and elicit information needs in a particular domain.
The results obtained by our prediction experiments show that the queries issued
to the conversational search system are useful for distinguishing different
information needs. Generally speaking, our results suggest that information need
categories during conversation can be predicted, with average accuracy values
of up to 64% when resampling is used. Even the non-resampled
performance of 40% is significantly higher than chance (which would be 10% with
ten classes). A major reason for the misclassifications found is the inhomogeneous
distribution of queries over the various information need categories. In almost
every remaining class, queries were most often misclassified as Procedure, the
dominant category. The impact of this class on the accuracy result can be seen in the form of a
low average precision ( 32%). A second aspect explaining misclassifications
may be the (to some extent strong) semantic similarity between individual
information need categories, e.g. between Procedure and Procedure Moment as well as
between Time and Time Report. Grouping such categories might lead to higher
accuracies. However, detailed information would be lost.
      </p>
      <p>Having said this, we identify several challenges posed to conversational
search systems throughout data collection and preparation. All of these relate to
resolving information needs. Spoken language interactions pose, first, the
challenge of understanding the pragmatics of dialect use. Dialect – with a variety
of types and levels – was used by almost all participants. Translating these
expressions to standard German is a major problem for speech recognition systems
because it is more than a mere word-by-word translation. On many occasions
a wealth of pragmatics was needed to fully comprehend the queries issued by
participants. A second, related challenge was the fact that information needs
were often not clearly defined. This means that slicing an utterance into information
needs requires a large amount of world knowledge, which poses major challenges
for conversational search systems.</p>
    </sec>
    <sec id="sec-5">
      <title>On the problems with prediction and response to information needs</title>
      <p>The assistance a conversational search system provides will indeed benefit from
the capability to predict user needs. This can be illustrated along three lines,
all of which are grounded in encounters from our corpus. First, the system can
focus on the demands of the particular situation instead of sticking to a static
programme flow. The following excerpt from our corpus is one example:
"Er, can you read the ingredients list out loud to me, so I can get them?"
System: (reads ingredient list slowly and waits for confirmation by
the participant that s/he got a particular ingredient), part. 14.
Second, a system showing predictive capabilities can provide feedback on
deviations from anticipated actions, e.g. from a recipe's default
procedure. Depending on what the user said, i.e. based on the discourse markers
that the system can observe, it can adapt to the new situation.</p>
      <p>"I would have added some tomato paste or something like that, so that
it isn't so dry... but if it's not in the recipe." System: (explains that it
would be possible to add this), part. 40</p>
      <p>
        Third, most conversational agents (see [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) use a rule-based approach to
extract the user's intent. However, they only analyze the "surface" of an
utterance and make decisions based on keywords. This does not necessarily
reflect the user's true information need, which leads to misclassifications.
Our prediction task offers the possibility not only to investigate the surface but
also to go deeper into semantics using word embeddings. With these, information needs can be
detected more accurately, and a system can thus provide the
information the user really wants.
      </p>
      <p>"Can you search for a recipe for Sauce Hollandaise, please?" System:
Sure. (searches and reads the ingredients out loud), part. 2</p>
      <p>One possible solution to the aforementioned problems might be to
include more features than just the information need in the classification task.
Our hypothesis is that a multidimensional vector including several linguistic
features such as dialogue acts and the current task state might improve the
performance when predicting the user's information need and intent. Currently,
we are working on relabeling the corpus across various dimensions going
beyond cooking-specific information needs. By doing this, we aim to obtain a
tree-structured coding of the data that might help us to analyse the conversational
structure on different levels. We hope that this will improve the classification
performance, too.</p>
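      <p>To make the idea of a multidimensional feature vector more concrete, the following Python sketch concatenates bag-of-words counts of an utterance with one-hot indicators for a dialogue act and a task state. It is only a sketch under assumptions: the dialogue-act and task-state inventories, the example utterances, and all function names are hypothetical and do not describe the annotation scheme or the feature set actually used in this work.</p>
      <preformat>
```python
# Hypothetical label inventories, invented for illustration only.
DIALOGUE_ACTS = ["question", "request", "statement"]
TASK_STATES = ["planning", "preparing", "cooking"]


def build_vocabulary(utterances):
    """Map each lower-cased token seen in the utterances to a column index."""
    vocab = {}
    for text in utterances:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def one_hot(value, categories):
    """Return a list with a single 1 at the position of value in categories."""
    return [1 if value == c else 0 for c in categories]


def featurize(text, dialogue_act, task_state, vocab):
    """Concatenate word counts, dialogue-act and task-state indicators."""
    counts = [0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:
            counts[vocab[token]] += 1  # stopwords are deliberately kept
    return (counts
            + one_hot(dialogue_act, DIALOGUE_ACTS)
            + one_hot(task_state, TASK_STATES))


# Toy utterances invented for illustration; they are not corpus data.
utterances = ["can you read the ingredients list", "how long does it cook"]
vocab = build_vocabulary(utterances)
vec = featurize("can you read the list", "request", "preparing", vocab)

print("vector length:", len(vec))
print("dialogue-act part:", vec[len(vocab):len(vocab) + len(DIALOGUE_ACTS)])
print("task-state part:", vec[len(vocab) + len(DIALOGUE_ACTS):])
```
      </preformat>
      <p>A vector of this kind could be passed to any standard classifier in place of the word counts alone. Stopwords are not filtered in the sketch, in line with the observation that they carry information for this task.</p>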
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>Our preliminary results shed light on the information needs which occur in a
home cooking context and indicate the feasibility of identifying needs
automatically. This pilot study emphasizes the feasibility and value of this kind of
approach.</p>
      <p>Future work will collect additional naturalistic data to obtain more
generalizable results and thus promote research in the conversational search domain.
Also, our results showed that a more detailed classification of user utterances
is necessary to classify user intent. Thus, other linguistic dimensions such as
dialogue acts will be incorporated in the ongoing turn annotation of this
corpus. Based on our feasibility study, similar experiments in the cooking domain
or other domains can be conducted to achieve greater representativeness. Our study
focused on the utterances made by users. However, the utterances made by the
conversational system are equally important, as it needs to be capable of
generating utterances suitable for the current context. The utterances made by the
experimenter can thus be employed in future work to investigate this issue.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Allbritton</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Discourse cues in narrative text: Using production to predict comprehension</article-title>
          .
          <source>In: AAAI Fall Symposium on Psychological Models of Communication in Collaborative Systems</source>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Allen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferguson</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stent</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>An architecture for more realistic conversational systems</article-title>
          .
          <source>In: Proceedings of the 6th International Conference on Intelligent User Interfaces</source>
          . pp.
          <volume>1</volume>
          -
          <issue>8</issue>
          . IUI '01,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          , New York, NY, USA (
          <year>2001</year>
          ). https://doi.org/10.1145/359784.359822, http://doi.acm.org/10.1145/359784.359822
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bernstein</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teevan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumais</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebling</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horvitz</surname>
          </string-name>
          , E.:
          <article-title>Direct answers for search queries in the long tail</article-title>
          .
          <source>In: Proceedings of the SIGCHI conference on human factors in computing systems</source>
          . pp.
          <volume>237</volume>
          -
          <fpage>246</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loper</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <source>Natural Language Processing with Python</source>
          . O'Reilly Media Inc., Sebastopol, CA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Castella</surname>
            ,
            <given-names>V.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abad</surname>
            ,
            <given-names>A.Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alonso</surname>
            ,
            <given-names>F.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silla</surname>
            ,
            <given-names>J.P.:</given-names>
          </string-name>
          <article-title>The influence of familiarity among group members, group atmosphere and assertiveness on uninhibited behavior through three different communication media</article-title>
          .
          <source>Computers in Human Behavior</source>
          <volume>16</volume>
          (
          <issue>2</issue>
          ),
          <volume>141</volume>
          -
          <fpage>159</fpage>
          (
          <year>2000</year>
          ). https://doi.org/10.1016/S0747-5632(00)00012-1, http://www.sciencedirect.com/science/article/pii/S0747563200000121
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chawla</surname>
            ,
            <given-names>N.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowyer</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>L.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kegelmeyer</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          :
          <article-title>SMOTE: Synthetic minority over-sampling technique</article-title>
          .
          <source>J. Artif. Int. Res</source>
          .
          <volume>16</volume>
          (
          <issue>1</issue>
          ),
          <volume>321</volume>
          -357 (Jun
          <year>2002</year>
          ), http://dl.acm.org/citation.cfm?id=1622407.1622416
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cieliebak</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deriu</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Egger</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uzdilli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>A twitter corpus and benchmark resources for german sentiment analysis</article-title>
          .
          <source>In: Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media</source>
          . pp.
          <volume>45</volume>
          -
          <fpage>51</fpage>
          .
          Association for Computational Linguistics (
          <year>2017</year>
          ), http://aclweb.org/anthology/W17-1106
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Craswell</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hawking</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Effective site finding using link anchor information</article-title>
          .
          <source>In: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval</source>
          . pp.
          <volume>250</volume>
          -
          <fpage>257</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Culpepper</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smucker</surname>
          </string-name>
          , M.D.:
          <article-title>Research frontiers in information retrieval: Report from the third strategic workshop on information retrieval in lorne (SWIRL 2018)</article-title>
          .
          <source>SIGIR Forum</source>
          <volume>52</volume>
          (
          <issue>1</issue>
          ),
          <volume>34</volume>
          -
          <fpage>90</fpage>
          (
          <year>2018</year>
          ). https://doi.org/10.1145/3274784.3274788
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bainbridge</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>An Analysis of Cooking Queries: Implications for Supporting Leisure Cooking Ethnographic Studies of Cooks and Cooking</article-title>
          .
          <source>In: iConference 2013 Proceedings</source>
          . pp.
          <volume>112</volume>
          -
          <issue>123</issue>
          (
          <year>2013</year>
          ). https://doi.org/10.9776/13160
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Dresing</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pehl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Praxisbuch Interview, Transkription &amp; Analyse. Anleitungen und Regelsysteme für qualitativ Forschende</article-title>
          . Dr. Dresing und Pehl GmbH, Marburg, 6th edn. (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Elsweiler</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trattner</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harvey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Exploiting food choice biases for healthier recipe recommendation</article-title>
          .
          <source>In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . pp.
          <volume>575</volume>
          -
          <fpage>584</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. fur Ernahrung, D.G.:
          <article-title>Die deutsche gesellschaft fur ernahrung e</article-title>
          .v.
          <source>(dge)</source>
          (
          <year>2018</year>
          ), https://www.dge.de/wir-ueber-uns/die-dge/
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Guy</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Searching by talking: Analysis of voice queries on mobile web search</article-title>
          .
          <source>In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . pp.
          <volume>35</volume>
          -
          <fpage>44</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          :
          <article-title>Speech and Language Processing (2nd Edition)</article-title>
          . Prentice-Hall, Inc., Upper Saddle River, NJ, USA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Kiseleva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassan</surname>
            <given-names>Awadallah</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Crook</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.C.</given-names>
            ,
            <surname>Zitouni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Anastasakos</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>Predicting user satisfaction with intelligent assistants</article-title>
          .
          <source>In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . pp.
          <volume>45</volume>
          -
          <fpage>54</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Kopp</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gesellensetter</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , Kramer,
          <string-name>
            <given-names>N.C.</given-names>
            ,
            <surname>Wachsmuth</surname>
          </string-name>
          ,
          <string-name>
            <surname>I.:</surname>
          </string-name>
          <article-title>A conversational agent as museum guide: Design and evaluation of a real-world application</article-title>
          . In: Panayiotopoulos,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Gratch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Aylett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Ballin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Olivier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Rist</surname>
          </string-name>
          , T. (eds.)
          <source>Intelligent virtual agents: Proceedings 5th International Conference. LNAI</source>
          ,
          <volume>3661</volume>
          , pp.
          <volume>329</volume>
          -
          <fpage>343</fpage>
          . Springer, Berlin (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Leggeri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iocchi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Task-oriented conversational agent self-learning based on sentiment analysis</article-title>
          .
          <source>In: Proceedings of the 2nd Workshop on Natural Language for Artificial Intelligence (NL4AI</source>
          <year>2018</year>
          )
          <article-title>co-located with 17th International Conference of the Italian Association for Artificial Intelligence (AI*IA</article-title>
          <year>2018</year>
          ), Trento, Italy, November 22nd to 23rd,
          <year>2018</year>
          . pp.
          <volume>4</volume>
          -
          <issue>15</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. Lemaître, G.,
          <string-name>
            <surname>Nogueira</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aridas</surname>
            ,
            <given-names>C.K.</given-names>
          </string-name>
          :
          <article-title>Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>18</volume>
          (
          <issue>17</issue>
          ),
          <volume>1</volume>
          -
          <issue>5</issue>
          (
          <year>2017</year>
          ), http://jmlr.org/papers/v18/16-365.html
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Leveling</surname>
          </string-name>
          , J.:
          <article-title>On the Effect of Stopword Removal for SMS-Based FAQ Retrieval</article-title>
          . In: Bouma,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Ittoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Métais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Wortmann</surname>
          </string-name>
          , H. (eds.)
          <source>Natural Language Processing and Information Systems</source>
          . pp.
          <volume>128</volume>
          -
          <fpage>139</fpage>
          . Springer, Berlin, Heidelberg (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Mondal</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Lexicon, meaning relations, and semantic networks</article-title>
          .
          <source>In: Proceedings of the 2nd Workshop on Natural Language for Artificial Intelligence (NL4AI</source>
          <year>2018</year>
          )
          <article-title>co-located with 17th International Conference of the Italian Association for Artificial Intelligence (AI*IA</article-title>
          <year>2018</year>
          ), Trento, Italy, November 22nd to 23rd,
          <year>2018</year>
          . pp.
          <volume>40</volume>
          -
          <issue>52</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Morbini</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Forbell</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DeVault</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sagae</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traum</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rizzo</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          :
          <article-title>A mixed-initiative conversational dialogue system for healthcare</article-title>
          .
          <source>In: Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue</source>
          . pp.
          <volume>137</volume>
          -
          <fpage>139</fpage>
          . SIGDIAL '12, Association for Computational Linguistics, Stroudsburg, PA, USA (
          <year>2012</year>
          ), http://dl.acm.org/citation.cfm?id=2392800.2392825
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderplas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
          </string-name>
          , E.:
          <article-title>Scikit-learn: Machine learning in python</article-title>
          .
          <source>J. Mach. Learn. Res</source>
          .
          <volume>12</volume>
          ,
          <issue>2825</issue>
          -2830 (Nov
          <year>2011</year>
          ), http://dl.acm.org/citation.cfm?id=1953048.2078195
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Radlinski</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craswell</surname>
          </string-name>
          , N.:
          <article-title>A theoretical framework for conversational search</article-title>
          .
          <source>In: Proceedings of the 2017 Conference on Conference Human Information Interaction and Retrieval</source>
          . pp.
          <volume>117</volume>
          -
          <fpage>126</fpage>
          . CHIIR '17,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          , New York, NY, USA (
          <year>2017</year>
          ). https://doi.org/10.1145/3020165.3020183, http://doi.acm.org/10.1145/3020165.3020183
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Saif</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>On stopwords, filtering and data sparsity for sentiment analysis of Twitter</article-title>
          .
          <source>In: Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014</source>
          . pp.
          <fpage>810</fpage>
          –
          <lpage>817</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Schober</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bloom</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          :
          <article-title>Discourse cues that respondents have misunderstood survey questions</article-title>
          .
          <source>Discourse Processes</source>
          <volume>38</volume>
          (
          <issue>3</issue>
          ),
          <fpage>287</fpage>
          –
          <lpage>308</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Shiga</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joho</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blanco</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trippas</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Modelling information needs in collaborative search conversations</article-title>
          .
          <source>In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . pp.
          <fpage>715</fpage>
          –
          <lpage>724</lpage>
          . SIGIR '17,
          <publisher-name>ACM</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          (
          <year>2017</year>
          ). https://doi.org/10.1145/3077136.3080787, http://doi.acm.org/10.1145/3077136.3080787
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Spillane</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilmartin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saam</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cowan</surname>
            ,
            <given-names>B.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lawless</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wade</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Introducing ADELE: A personalized intelligent companion</article-title>
          .
          <source>In: Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents</source>
          . pp.
          <fpage>43</fpage>
          –
          <lpage>44</lpage>
          . ISIAA 2017,
          <publisher-name>ACM</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          (
          <year>2017</year>
          ). https://doi.org/10.1145/3139491.3139492, http://doi.acm.org/10.1145/3139491.3139492
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Stolcke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ries</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coccaro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shriberg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bates</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ess-Dykema</surname>
            ,
            <given-names>C.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meteer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Dialogue act modelling for automatic tagging and recognition of conversational speech</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>26</volume>
          ,
          <fpage>339</fpage>
          –
          <lpage>373</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Strauss</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corbin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <source>Grounded Theory: Grundlagen qualitativer Sozialforschung</source>
          .
          <publisher-name>Beltz</publisher-name>
          ,
          <publisher-loc>Weinheim</publisher-loc>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          :
          <article-title>The process of asking questions</article-title>
          .
          <source>American Documentation</source>
          <volume>13</volume>
          (
          <issue>4</issue>
          ),
          <fpage>391</fpage>
          –
          <lpage>396</lpage>
          (
          <year>1962</year>
          ). https://doi.org/10.1002/asi.5090130405, https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.5090130405
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Teevan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cutrell</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fisher</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Drucker</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>André</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Visual snippets: summarizing web pages for search and revisitation</article-title>
          .
          <source>In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems</source>
          . pp.
          <fpage>2023</fpage>
          –
          <lpage>2032</lpage>
          . ACM (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Thompson</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          , Goker,
          <string-name>
            <given-names>M.H.</given-names>
            ,
            <surname>Langley</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.:</surname>
          </string-name>
          <article-title>A personalized system for conversational recommendations</article-title>
          .
          <source>J. Artif. Int. Res.</source>
          <volume>21</volume>
          (
          <issue>1</issue>
          ),
          <fpage>393</fpage>
          –
          <lpage>428</lpage>
          (Mar
          <year>2004</year>
          ), http://dl.acm.org/citation.cfm?id=1622467.1622479
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Trippas</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spina</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cavedon</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joho</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Informing the design of spoken conversational search: Perspective paper</article-title>
          .
          <source>In: Proceedings of the 2018 Conference on Human Information Interaction &amp; Retrieval</source>
          . pp.
          <fpage>32</fpage>
          –
          <lpage>41</lpage>
          . CHIIR '18,
          <publisher-name>ACM</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          (
          <year>2018</year>
          ). https://doi.org/10.1145/3176349.3176387, http://doi.acm.org/10.1145/3176349.3176387
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Trippas</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spina</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cavedon</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>How do people interact in conversational speech-only search tasks: A preliminary analysis</article-title>
          .
          <source>In: Proceedings of the 2017 Conference on Conference Human Information Interaction and Retrieval</source>
          . pp.
          <fpage>325</fpage>
          –
          <lpage>328</lpage>
          . CHIIR '17,
          <publisher-name>ACM</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          (
          <year>2017</year>
          ). https://doi.org/10.1145/3020165.3022144, http://doi.acm.org/10.1145/3020165.3022144
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Winterboer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferreira</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Do discourse cues facilitate recall in information presentation messages?</article-title>
          <source>In: INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association</source>
          , Brisbane, Australia, September 22-26, 2008. p.
          <fpage>543</fpage>
          .
          <publisher-name>ISCA</publisher-name>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mani</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>KNN Approach to Unbalanced Data Distributions: A Case Study Involving Information Extraction</article-title>
          .
          <source>In: Proceedings of the ICML'2003 Workshop on Learning from Imbalanced Datasets</source>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>