<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Guiding Users by Dynamically Generating Questions in a Chatbot System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jannis Pilgrim</string-name>
          <email>jannis.pilgrim@campus.tu-berlin.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jakob Kemmler</string-name>
          <email>jakob.kemmler@campus.tu-berlin.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Moritz Wassmer</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Silvio Echsle</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Lommatzsch</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DAI-Labor, TU Berlin</institution>
          ,
          <addr-line>Ernst-Reuter-Platz 7, D-10587 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LWDA'22: Lernen</institution>
          ,
          <addr-line>Wissen, Daten, Analysen</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technische Universität Berlin</institution>
          ,
          <addr-line>Straße des 17. Juni 135, D-10623 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Chatbots efficiently support users in finding relevant answers in complex domains. They aggregate data from different sources and provide information in an interactive dialog. In a conversation, chatbots mimic human experts, providing information in well-consumable pieces. They try to guide users towards predicted information needs. One challenge for chatbots consists in generating questions if user inputs are ambiguous or incomplete. Computing good counter-questions requires an understanding of the user’s intentions and a good structuring of the data to provide the needed information in a suitable format. In this work we present a solution for generating clarification questions based on dynamic data collections, applying semantic clustering and flexible question trees. We optimize and evaluate our approach for a chatbot tailored to answering questions related to services offered by the local public administration. We show that our approach efficiently helps users to find the relevant information in a natural conversation, avoiding long lists of potentially interesting search results. The approach is based on a data enrichment and knowledge extraction pipeline that enables the adaptation of the components to different knowledge sources and the specific requirements of new domains.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The rapid advances of NLP techniques in recent years and the great popularity of social media
chat systems have led to a growing interest in chatbots. Chatbots mimic the behavior of a human
expert and provide relevant information and answers to users in suitable “pieces”. Compared
with complex web documents or long lists provided by search engines, chatbots guide users in
a natural dialog to the needed information.</p>
      <p>Providing an adequate answer to a complex user question is a challenging task, since user
questions can be related to a huge range of topics and aspects. Moreover, user questions are
often imprecise and ambiguous due to limited knowledge of the domain. Thus, a chatbot must
determine the most relevant information based on the context and the knowledge about the user
(making sure that all potentially relevant cases are considered). The computation of potentially
relevant responses can be efficiently done using Information Retrieval methods optimized
for finding potentially relevant information in large data collections. Determining the response
best-fitting the user’s intention usually requires additional information, which chatbots must
collect by asking the right questions. The generation of good questions requires a precise
understanding of the user question, a deeper analysis of potentially matching answers, and NLP
techniques for generating questions and understanding given text snippets.</p>
      <p>In this paper we analyze the scenario of optimizing a chatbot tailored for providing answers
related to the public services of a major German city. In our scenario, citizens need information
about the services offered by the public administration (e.g. how to get a residential parking
permit or how to get a passport for babies). The conversation is usually started by the user with
an initial question. Based on the question, the chatbot predicts potentially relevant services. In
order to find the demanded information, the chatbot generates questions optimized for reducing
the ambiguity in the user question and to reach the requested information with a minimal
number of steps. We develop a component that deploys language models and dynamic decision
trees for guiding the user to get the intentionally demanded answers.</p>
      <p>The remainder of this paper is organized as follows. In Section 2, related research is
summarized along with an explanation of how it contributed to our work. Section 3 describes our
approach and the structure of the underlying data. Subsequently, the evaluation of our approach
is presented in Section 4, through a quantitative and qualitative analysis of the chosen methods
and the developed chatbots. Finally, Section 5 discusses the accomplishments of this paper and
provides an outlook on future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>In this section we review prior work on asking clarifying questions in conversational
Information Retrieval (IR) and related methods. We first look at the usefulness of clarifying questions;
then we review related approaches for systems that ask clarifying questions. We conclude this
section by analyzing clustering methods for IR and reviewing literature on decision trees as
knowledge extraction methods that can be used for generating clarifying
questions.</p>
      <p>
        Usefulness of Clarification Questions Presenting information on small-screen devices or
in voice-only situations is challenging. Information access can be improved by systems that
actively support the interaction with the user [1]. It has been shown that users like to be asked for
clarification [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. Information obtained through clarifying questions can yield substantial improvements
in retrieval performance [4].
Clarification Question Models Zamani et al. [3] proposed three models to generate
clarification questions and candidate answers for open-domain web search. The authors combined a
rule-based slot-filling model using question templates, a supervised model, and a reinforcement
learning-based model. By aggregating huge collections of log data, weak supervision signals
are extracted from query reformulation data and used for improving the models. The approach
is not applicable in our scenario due to the small amount of log data that could be used for
learning query reformulations.
      </p>
      <p>Rosset et al. [5] built two conversational question suggestion models based on a BERT-based
ranker and a GPT-2-based generator. They trained the ranking model in a multi-task fashion,
mainly on weak supervision labels obtained from past user behavior, such as clicks on “People
Also Ask” (PAA) panes, but also on human-annotated relevance labels. Their natural language
generation model is trained on the PAA questions that were clicked after the user issued a
query. In our scenario, a PAA approach cannot be used due to the lack of sufficient logged
user questions.</p>
      <p>Aliannejadi et al. [4] collected a dataset named “Qulac” via crowd-sourcing to foster research
on clarifying questions in open-domain IR. They proposed a conversational search system
that selects a clarifying question from a pool of questions and ranks stored documents based on
the user’s answer. The researchers split the task into question retrieval, question selection, and
document retrieval. This approach requires predefined clarifying questions as well as
query-question-answer-target mappings to train all components of the system. Due to the resource constraints
of this project, an equivalent dataset could not be generated, making this approach a bad fit for
our use case.</p>
      <p>
        Datasets Even though many datasets related to conversational search exist [6, 4, 7
        <xref ref-type="bibr" rid="ref2">, 8, 9, 2, 10</xref>
        ]
most of them are either too domain-specific or not suitable in their structure to be of use for
our task. In addition, most of the datasets are in English, while our domain is German
public-administration language, which involves many unique words. Therefore, we found
these datasets unhelpful for our problem, for example for transfer learning of existing
models to our use case.
      </p>
      <p>Clustering Text For an efficient refinement of a set of potential topics, the resources must
first be categorized. This categorization might be based on the annotations provided by the dataset,
which mostly follow a textual format. In the following, we discuss two papers investigating
different aspects of document clustering.</p>
      <p>Leouski and Croft [11] compared different clustering techniques for analyzing textual retrieval
results. They reported good results with agglomerative hierarchical clustering algorithms in
combination with frequency-based embeddings. Additionally, their results showed that
human-oriented evaluation should be preferred over an artificial one.</p>
      <p>Mohammed et al. [12] analyzed document clustering by comparing two strategies based on a
variety of popular evaluation methods. The researchers present an approach based on semantic
embeddings in combination with a density-based clustering algorithm (DBSCAN) that outperforms
a frequency-based embedding in combination with K-Means, especially on large datasets.
Discussion Most of the related works have in common that data and feedback signals were
either crowd-sourced or obtained by creating weakly supervised labels with the help of massive
log data; the authors were therefore able to train models with supervised methods. In many
conversational IR settings outside the major web search industry, there is most likely not sufficient data for
weak supervision available, or there are not enough resources to obtain such data (as in our use case).
Approaches built upon clustering methods or decision-tree-based keyword selection combined
with question templates may tackle these problems, since they only require keywords for
the documents to be retrieved. Therefore, it is worthwhile to investigate the proposed approaches.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Approach</title>
      <p>In this section we give a comprehensive explanation of the three approaches developed in the
context of this paper. First, we explain the dataset underlying our approaches. Then, a
description of the fundamental algorithm shared by all approaches is provided. After that, we
give a detailed description of the implemented approaches as well as the technologies used.</p>
      <sec id="sec-3-0">
        <title>3.1. Data</title>
        <p>The basis of our approaches is an annotated dataset provided by the city of Berlin.¹ The dataset
consists of 881 descriptions of services offered by the city administration, such as the renewal
of a personal ID card or getting a residential parking permit. Each service entry is created
by a human expert (“editor”) with a list of keywords describing the respective service. These
keywords consist of nouns, verbs, and numbers. A simplified example is shown in Table 1.</p>
      </sec>
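      <p>For illustration, a single service entry can be sketched as follows; the field names and values are our own hypothetical simplification, not the dataset’s actual schema.</p>

```python
# Hypothetical, simplified service entry; field names and values are our
# own illustration, not the actual schema of the Berlin service dataset.
service = {
    "name": "Personalausweis beantragen",  # "apply for an ID card"
    "keywords": ["personalausweis", "ausweis", "beantragen", "verlust"],
}

# The editor-provided keyword list (nouns, verbs, numbers) is the only
# annotation the refinement approaches described below rely on.
keyword_set = set(service["keywords"])
```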
      <sec id="sec-3-1">
        <title>3.2. General Approach</title>
        <p>Our three approaches share the first step with the chatbot in place: the first user interaction in
the form of an initial question. The user question is sent to an Apache Solr² server, which provides
efficient full-text access to the aforementioned dataset. Given the query, the server retrieves
a list of relevant resources. This list is iteratively refined until the desired resource is found.
This refinement process consists of four steps.</p>
        <p>First, all relevant documents are categorized and grouped based on their respective keyword
annotations. Then, based on a heuristic, one group is selected and a superior term representing
all resources in that group is inferred. Based on this superior term, a binary counter question
using a question template is constructed and presented to the user (Does your question revolve
around topic X?). When the user answers affirmatively, all resources but the ones contained in
the selected group are removed from the list of possible resources. Otherwise, all resources</p>
        <sec id="sec-3-1-1">
          <title>¹https://service.berlin.de/</title>
          <p>²https://solr.apache.org/
contained in the selected group are removed. This procedure is repeated until only one resource
is left, which is then presented to the user as the final answer.</p>
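          <p>The refinement loop described above can be sketched as follows. This is a minimal illustration; the function names (cluster_fn, superior_term_fn, ask_user) are our own stand-ins for the components the text describes, not the paper’s actual interfaces.</p>

```python
def refine(resources, cluster_fn, superior_term_fn, ask_user):
    """Iteratively narrow a result set with binary counter questions.

    resources: list of service dicts; cluster_fn groups them,
    superior_term_fn names a group, ask_user returns True/False.
    All three callables are hypothetical stand-ins.
    """
    while len(resources) > 1:
        groups = cluster_fn(resources)           # step 1: categorize/group
        group = max(groups, key=len)             # step 2: pick a group
        term = superior_term_fn(group)           # step 2: infer superior term
        # step 3: binary counter question from a template
        if ask_user(f"Does your question revolve around {term}?"):
            resources = group                    # step 4: keep only the group
        else:
            resources = [r for r in resources if r not in group]
    return resources[0]                          # final answer
```

With a trivial cluster function that puts every resource in its own group, the loop degenerates to asking about one resource at a time; the approaches below differ exactly in how they implement the grouping and naming steps.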
          <p>The implementation of this algorithm raises two main challenges: the categorization of the
resources and the inference of the superior term describing the selected group. In the following, we
give a description of three approaches tackling these challenges using different strategies and
technologies.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.3. Service Clustering</title>
        <p>The first approach, which we name Service Clustering, is based on the applied method of grouping
resources. For a visual depiction, see Fig. 1. After the initial list of relevant resources is
retrieved from the indexed dataset, the resources are categorized through clustering. To enable
this, a meaningful representation is needed that makes resources directly comparable to each
other. This representation is created by encoding the keyword annotations of each
resource using a text embedding. The embedding used is TF-IDF-based, as annotations
commonly overlap between resources; the TF-IDF embedding emphasizes keywords that are
distinct between resources. As those keywords carry the most information for differentiating
between resources, this embedding provides vectors optimized for our setting. For clustering
on the resulting vector representations, two algorithms were tested: K-Means and DBSCAN.</p>
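        <p>Under our assumptions, the embedding and clustering step can be sketched with scikit-learn, joining each resource’s keywords into one document for the TF-IDF vectorizer; the toy keyword strings are our own examples, not dataset entries.</p>

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy keyword annotations, one string per resource (our own examples).
keyword_docs = [
    "personalausweis ausweis beantragen",
    "personalausweis verlust melden",
    "parken anwohner vignette",
    "parken vignette beantragen",
]

# TF-IDF weights keywords that are rare across resources more highly,
# emphasizing exactly the terms that differentiate resources.
X = TfidfVectorizer().fit_transform(keyword_docs)

# K-Means with k=2 mirrors the binary question template.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```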
        <p>After clustering, one cluster is selected for generating a counter question. This selection
should be made such that the information gained from the user is maximized, independent of
the answer. Here, always picking the largest cluster is the best strategy, as the number of resources
that can be eliminated is maximized.</p>
        <p>Having selected the largest cluster, a superior term needs to be inferred. To achieve this,
a semantic embedding was used. Semantic embeddings encode words such that the distance
between their respective vectors corresponds to the semantic similarity between the words.
Such a representation was created for every keyword contained in the annotation of at least one
resource in the selected cluster. For this, the large German model³ from the spaCy library was
used. As the annotation vocabulary is highly domain-specific, a reliable semantic embedding
based on the used language model could not be computed for all potentially relevant keywords;
only keywords for which an embedding could be created were used for the inference. After
the encoding step, the superior term was selected as the keyword whose representation has
the smallest summed distance to all other representations. This keyword forms the cluster
centroid and is semantically closest to all words in the cluster, thus representing it best. As the
measure of distance between vectors, cosine similarity was used.</p>
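        <p>The superior-term inference can be sketched as follows: normalize each keyword vector, then pick the keyword with the smallest summed cosine distance to all others. The embed callable is a placeholder for the spaCy model mentioned above, not its actual API.</p>

```python
import numpy as np

def superior_term(keywords, embed):
    """Return the keyword closest (by summed cosine distance) to all
    others in the cluster, i.e. the semantic centroid.
    `embed` maps a keyword to a vector; here it is any callable standing
    in for spaCy's large German model."""
    unit = {k: embed(k) / np.linalg.norm(embed(k)) for k in keywords}
    def summed_distance(k):
        # cosine distance = 1 - cosine similarity (vectors are unit-length)
        return sum(1.0 - float(unit[k] @ unit[other]) for other in keywords)
    return min(keywords, key=summed_distance)
```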
        <p>Clustering Algorithms
For clustering the services, two algorithms were tested: K-Means [13] and DBSCAN [14]. The
following gives a short description of the thought process behind this decision. These two
algorithms are commonly used in different problem settings surrounding text clustering [15, 16,
17]. As both are based on different ideas and therefore come with varying drawbacks, they are
often compared to each other to find the best performing approach in a certain domain [18, 19].</p>
        <sec id="sec-3-2-1">
          <title>³https://spacy.io/models/de#de_core_news_lg</title>
          <p>K-Means K-Means is one of the most popular clustering algorithms, with a variety of applications
including document classification. In K-Means, the number of desired clusters needs to be
specified, which is beneficial with the binary question template used in this approach. In each
iteration, the maximum percentage of the set of relevant documents that can be guaranteed to be
pruned is 50%. This is the case when the clustering results in two equally sized clusters: no
matter the user’s answer, half of the resources can be eliminated, resulting in a logarithmic
convergence speed. K-Means, allowing the number of clusters to be fixed to two, might give a good
approximation of these ideal conditions.</p>
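          <p>The logarithmic convergence under these ideal conditions can be made concrete: with n resources and a perfect halving per question, at most ⌈log₂ n⌉ questions are needed. A minimal sketch:</p>

```python
import math

def ideal_turns(n_resources):
    """Questions needed when every binary question halves the result set,
    the best case described for K-Means with two equally sized clusters."""
    if n_resources <= 1:
        return 0
    return math.ceil(math.log2(n_resources))

# Even the full dataset of 881 services would need at most 10 such questions.
```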
          <p>On the other hand, large clusters come with a significant drawback. As the number of
resources per cluster increases, the complexity of finding a representative superior term
increases. This might result in imprecise counter questions and therefore error-prone retrieval
performance.</p>
          <p>DBSCAN DBSCAN is a density-based algorithm, commonly used in the context of document
classification. In contrast to K-Means, DBSCAN does not require a fixed number of
clusters. Instead, a parameter epsilon is defined, specifying the maximum distance between two
data points for them to be considered part of one cluster. The method results in a dynamically
adapted number of clusters. That allows us to control the in-cluster similarity and to facilitate
the inference of superior terms. In our experiments we found an epsilon of 1.3 to result in the
best quality of superior terms. A high in-cluster similarity also results in smaller clusters and
therefore a larger number of clusters. This is likely to limit the number of resources that can be
eliminated in each iteration, resulting in a slower convergence speed.</p>
          <p>Figure 1: Program flow of the Service Clustering approach. After the initial query, the services are clustered and a superior term is inferred; while the result set holds more than one resource, a counter question is asked and the result set is refined based on the user answer; otherwise the remaining resource is returned.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.4. Question Tree</title>
        <p>The next approach, which we call the Question Tree approach, is based on the common decision tree
algorithm ID3 [20]. Contrary to its usual usage in classification, it is herein applied to
information retrieval. The tree is constructed at run time for every single query. Each node represents
a question to the user, and the target variable is the service name, implying that purity is reached
only if the sample size in a branch is one.</p>
        <p>The keyword annotation (Table 1) is used as the decision variable for the tree. Each keyword
is mapped to a Boolean variable, indicating whether a service has it in its keyword list
(1) or not (0). This results in an 𝑛 × 𝑚 matrix, where 𝑛 is the number of services in a
result set and 𝑚 the number of keywords associated with any of these services. The result of
the transformation is shown in Table 2.</p>
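        <p>The Boolean transformation can be sketched as follows; the helper name and input shape are our own, mirroring the 𝑛 × 𝑚 matrix described above.</p>

```python
def keyword_matrix(services):
    """Build one row per service and one column per keyword occurring in
    any service of the result set; a cell is 1 iff the service carries
    the keyword (a sketch, not the paper's exact implementation)."""
    columns = sorted({k for s in services for k in s["keywords"]})
    rows = [[1 if k in set(s["keywords"]) else 0 for k in columns]
            for s in services]
    return columns, rows
```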
        <p>Fig. 2 depicts a fully constructed tree for the fictitious example of an initial query of
“identification”, originating from the tables just seen. Each node, in orange, represents a question
about the keyword in its title. The heuristic for finding the group (the keyword) to ask the user
about is choosing the variable that maximizes information gain, which is a greedy approach. The
representation is always the keyword itself.</p>
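        <p>The greedy keyword choice can be sketched as follows. Since the target is the service name and every service is its own class, the entropy of a set of m services is log₂ m, and the keyword maximizing information gain is the one splitting the result set most evenly; the function names are ours.</p>

```python
import math

def split_gain(n_yes, n_no):
    """Information gain of asking about one keyword when every service in
    the result set is its own class (entropy of a set of size m is log2 m)."""
    n = n_yes + n_no
    h = lambda m: math.log2(m) if m > 0 else 0.0
    return h(n) - (n_yes / n) * h(n_yes) - (n_no / n) * h(n_no)

def best_keyword(services):
    """Greedy ID3-style choice: the keyword with maximal information gain."""
    keywords = {k for s in services for k in s["keywords"]}
    def gain(k):
        n_yes = sum(k in s["keywords"] for s in services)
        return split_gain(n_yes, len(services) - n_yes)
    return max(keywords, key=gain)
```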
        <p>Figure 2: A fully constructed question tree for the example query “identification”. The nodes ask about the keywords Parking, Apply, Lost, and Pet; the leaves are the services Parking ID Application, Parking ID lost, ID card Application, Info on Pet ID Card, and Change address on ID card.</p>
        <p>Because the keywords were annotated by human annotators, imperfections could be found.
These imperfections come in the form of different keywords that carry the same semantic
information. This leads to unwanted side effects, as connections between corresponding resources
cannot be made. In some cases, this makes for a frustrating user experience, as sequential
counter questions might ask for information already provided by the user. Additionally, it
artificially inflates the decision tree. These mistakes in the annotation fall into four categories:
synonyms, different spellings of the same word, different grammatical surface forms of the same
word, and words very close in semantic information.</p>
        <p>To tackle this problem, we grouped these “similar” words and found a representation for each
group. In order to find these groups, an initial clustering was applied before the other steps
were executed. As the number of semantically unique words, and therefore the number of
clusters, was not known beforehand, DBSCAN [21] was used again. We set epsilon to 0.2 to
ensure a high semantic similarity in each cluster. spaCy’s large German model is used for
defining a semantic embedding; the cosine similarity in the vector space is used for computing
the similarity. Fig. 3 shows the program flow of the chatbot using the question tree approach.</p>
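        <p>The keyword-normalization step can be sketched with scikit-learn’s DBSCAN using cosine distance; the toy two-dimensional vectors are our own stand-ins for the spaCy embeddings, and eps=0.2 mirrors the threshold named above.</p>

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy stand-ins for semantic keyword embeddings (the paper uses spaCy's
# large German model); two near-identical vectors and one unrelated one.
keywords = ["ausweis", "personalausweis", "parken"]
vectors = np.array([
    [1.00, 0.00],
    [0.95, 0.05],   # semantically close to "ausweis" -> same group
    [0.00, 1.00],   # unrelated
])

# eps=0.2 with cosine distance enforces high in-group similarity;
# min_samples=1 ensures every keyword lands in some group.
labels = DBSCAN(eps=0.2, min_samples=1, metric="cosine").fit_predict(vectors)
groups = {}
for kw, lab in zip(keywords, labels):
    groups.setdefault(lab, []).append(kw)
```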
        <p>Figure 3: Program flow of the Question Tree approach. While the result set holds more than k resources, a keyword is chosen, a question is asked, and the result set is refined based on the user answer; otherwise the result set is returned.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <p>In this section we describe our evaluation procedure and present the results obtained from the
different approaches.</p>
      <sec id="sec-4-1">
        <title>4.1. Quantitative Analysis</title>
        <p>Procedure The main objective of the chatbot is to help users find the desired information
quickly and to suggest only the service fitting the user’s intent. To ensure reproducibility and
comparability in the evaluation, user interactions were simulated while various
measures were logged in the background. The test dataset consists of about 6,500 real user
dialogues. We connected every initial query to a service the user has clicked on at some
point of their dialog with the chatbot and assumed this to be their actual search intent, i.e.
the ground truth. As can be seen in Algorithm 1, the result set initially returned from the Solr
system is iteratively refined based on simulated answers to the chatbot’s questions. We reduced
complexity by assuming that users are always able to answer all questions correctly.</p>
        <p>The evaluation is biased towards actual chatbot usage, as only 363 of the total 881 services
could be mapped to the initial user queries.</p>
        <p>Algorithm 1 Evaluation Procedure Pseudocode
for (query, target) in dialogues do
    resultset ← Solr.getResults(query)
    while length(resultset) ≥ k do
        question ← chatbot.getQuestion()
        answer ← findCorrectAnswer(question, target, resultset) ▷ simulate correct user answer
        resultset ← chatbot.refineResultset(resultset, question, answer)
    end while
end for</p>
        <p>If the initial query yields a result set of any length, all three approaches are guaranteed to
find the intended user service, or respectively a result set of length 𝑘. This is due to the fact that
the Service Clustering approaches re-cluster at every iteration and that every combination of
keywords is unique. We introduce two different measures to compare the approaches:
• Mean Turns: mean number of turns needed to find a service
• Mean Information Gain: mean information gain of the answer to a question
The Information Gain (IG) of an iteration (Choose Question, Answer, Refine) is defined as
the difference between the natural logarithms of the lengths of the result set before and
after the answer, as depicted in the following equation:
IG(i) = ln(length(R_{i−1})) − ln(length(R_i))
(1)
Results We evaluate the different approaches quantitatively with the procedure just described
and iterate until 𝑘 = 1. Fig. 4 shows the distribution of how many turns are needed until a
conversation converges, and Table 3 holds all results of the quantitative analysis. The
distributions vary significantly. SC KMeans takes the longest to converge, with over five turns on
average, but is also the most prone to outliers (e.g. services that might be hard to find), while
otherwise being very evenly distributed. SC DBSCAN converges faster on average but includes some
conversations taking over 25 turns. The Question Tree approach using DBSCAN outperforms
all other approaches in terms of speed, with less than four turns and an information gain of
0.950 per question on average. None of the three approaches dominates all measures; however,
the Question Tree seems to be more suited for the use case, at least according to the quantitative
analysis. In the following, we highlight the qualitative point of view.</p>
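        <p>Equation (1) can be computed directly; the function name is ours. Halving the result set yields an information gain of ln 2 ≈ 0.693 per question, so the reported 0.950 average corresponds to better-than-halving refinements.</p>

```python
import math

def information_gain(len_before, len_after):
    """Eq. (1): natural-log difference of the result-set sizes before and
    after one Choose-Ask-Refine iteration."""
    return math.log(len_before) - math.log(len_after)

# Halving the result set gives ln 2 per question.
```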
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Quality of Questions</title>
        <p>In this section we analyze the quality of the generated questions focusing on the keywords
chosen by the approaches.</p>
        <p>Figure 4: Distribution of the number of turns per conversation for the algorithms SC DBSCAN, SC KMeans, and Question Tree DBSCAN.</p>
        <p>The keywords used as the basis for the generated questions should come from the user’s
vocabulary (to ensure that the user knows the terms), and the keywords should be unique and
simple (to minimize ambiguity). When analyzing the approaches, an interplay between
two characteristics of the chatbots can be observed. Chatbots whose largest cluster comprises
almost 50% of the services converge particularly quickly on a result set: clustering methods
that group around 50% of the services in each iteration also halve the result set,
regardless of the user’s response. Here one can observe parallels to bisection, where
the interval width is halved with each step, yielding a runtime of 𝑂(log(𝑛)). The second
relevant property is the representativeness of the keyword. Some approaches tend to find very
general keywords or ask for the same keyword multiple times. This can be attributed to the
fact that too-large clusters have been formed, for which it is difficult to find a common keyword.
The qualitative experiments confirm our assumption that chatbots with higher convergence
times ask qualitatively better questions. The specific characteristics of the algorithms have been
studied on several examples. The main observations are explained in the subsequent paragraphs.
Question Tree The question tree is one of the faster converging approaches. This can be
explained by the fact that the focus of the algorithm is on selecting particularly good keywords
and the clusters result from the user’s decision.</p>
        <p>Service Clustering with K-Means The approach of service clustering with K-Means as
the clustering method was convincing with particularly short dialogues. The reason this
approach converges so quickly to a solution can be attributed to the size of the
largest cluster being close to 50% of the result set. In some situations this forces services
to be assigned to large clusters while having no influence on the keyword chosen for these clusters. These
decisions ultimately make for less precise questions due to an inaccurately chosen keyword.
Service Clustering with DBSCAN The third approach uses the clustering algorithm
DBSCAN. This algorithm determines the number of clusters at runtime, which allows it to create
new clusters according to the number of topics. By adjusting the epsilon parameter accordingly,
this effect can be controlled. This control allows keeping an eye on the convergence speed as
well as on the quality of the questions.</p>
        <p>Overall, our analysis shows that the Question Tree approach provides the best questions.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this work, we presented a solution for generating counter questions based on dynamic data
collections. We developed and evaluated three approaches combining semantic clustering and
decision trees. The methods have been optimized to the specific requirements of the chatbot for
the German public administration. Our experiments show that our method generates reasonable
questions, effectively guiding users to desired resources in an intuitive conversational style.
Our solutions provide the basis for a well-working interactive information retrieval system.
Our approach can also be applied to similar scenarios, since it can be used with text collections
of answers or documents. With our findings, we contribute to the research on chatbot systems
and information retrieval. As future work we plan to improve the generation of questions, to
increase their naturalness, and to better adapt to the context-specific language style.
Acknowledgment
We thank the ITDZ Berlin for supporting the development of the chatbot framework.
[4] M. Aliannejadi, H. Zamani, F. Crestani, W. Croft, Asking clarifying questions in
open-domain information-seeking conversations, 2019.
[5] C. Rosset, C. Xiong, X. Song, D. Campos, N. Craswell, S. Tiwary, P. Bennett, Leading
conversational search by suggesting useful questions, in: The Web Conference ’20, 2020.
[6] H. Zamani, G. Lueck, E. Chen, R. Quispe, F. Luu, N. Craswell, Mimics: A large-scale data
collection for search clarification, in: Proc. of the 29th ACM CIKM, CIKM ’20, ACM, New
York, NY, USA, 2020, p. 3189–3196. doi:10.1145/3340531.3412772.
[7] C. Qu, L. Yang, W. B. Croft, J. R. Trippas, Y. Zhang, M. Qiu, Analyzing and characterizing
user intent in information-seeking conversations, in: The 41st Intl. ACM SIGIR Conf.,
ACM, 2018. doi:10.1145/3209978.3210124.
[8] R. Lowe, N. Pow, I. Serban, J. Pineau, The Ubuntu dialogue corpus: A large dataset for
research in unstructured multi-turn dialogue systems (2015). doi:10.18653/v1/W15-4640.
[9] F. Radlinski, K. Balog, B. Byrne, K. Krishnamoorthi, Coached conversational preference
elicitation: A case study in understanding movie preferences, in: Procs. of the 20th SIGdial
Meeting on Discourse and Dialogue, ACL, Stockholm, Sweden, 2019, pp. 353–360.
[10] E. Choi, H. He, M. Iyyer, M. Yatskar, W.-t. Yih, Y. Choi, P. Liang, L. Zettlemoyer, QuAC:
Question answering in context, 2018. URL: https://arxiv.org/abs/1808.07036. doi:10.48550/
ARXIV.1808.07036.
[11] A. V. Leouski, W. B. Croft, An evaluation of techniques for clustering search results,
Technical Report, 1996.
[12] S. M. Mohammed, K. Jacksi, S. R. M. Zeebaree, Glove Word Embedding and DBSCAN
algorithms for Semantic Document Clustering, in: Intl.Conf. on Advanced Science and
Engineering, 2020, pp. 1–6. doi:10.1109/ICOASE51841.2020.9436540.
[13] E. W. Forgy, Cluster analysis of multivariate data: efficiency versus interpretability of
classifications, Biometrics 21 (1965) 768–769.
[14] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, et al., A density-based algorithm for discovering
clusters in large spatial databases with noise, in: KDD, volume 96, 1996, pp. 226–231.
[15] C. Xiong, Z. Hua, K. Lv, X. Li, An improved k-means text clustering algorithm by optimizing
initial cluster centers, in: 7th Intl. Conf. on Cloud Comp. and Big Data, 2016, pp. 265–268.
[16] R. G. Cretulescu, D. Morariu, M. Breazu, D. Volovici, Dbscan algorithm for document
clustering, Intl. Journal of Adv. Statistics and IT&amp;C for Economics and Life Sciences 9
(2019).
[17] R. N. G. Indah, R. Novita, O. B. Kharisma, R. Vebrianto, S. Sanjaya, T. Andriani, W. P. Sari,
Y. Novita, R. Rahim, et al., DBSCAN algorithm: Twitter text clustering of trend topic pilkada
pekanbaru, in: Journal of Physics, volume 1363, IOP Publishing, 2019, p. 012001.
[18] M. A. Ahmed, H. Baharin, P. N. Nohuddin, Analysis of k-means, DBSCAN and OPTICS cluster
algorithms on al-Quran verses, Intl. Journal of Adv. Computer Science and Apps. 11 (2020).
[19] D. Xu, Y. Tian, A comprehensive survey of clustering algorithms, Annals of Data Science
2 (2015) 165–193.
[20] Data mining, practical machine learning tools and techniques, in: I. H. Witten, E. Frank,
M. A. Hall, C. J. Pal (Eds.), Data Mining, 4th ed., Morgan Kaufmann, 2017, pp. i–iii.
doi:https://doi.org/10.1016/B978-0-12-804291-5.00014-3.
[21] S. Mohammed, K. Jacksi, S. Zeebaree, A state-of-the-art survey on semantic similarity
for document clustering using glove and density-based algorithms, Journal of Electrical
Engineering and Computer Science 22 (2021) 552–562. doi:10.11591/ijeecs.v22.i1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          ,
          <article-title>The importance of interaction for information retrieval</article-title>
          ,
          <source>in: Procs. of the 42nd Intl. ACM SIGIR Conf., ACM</source>
          , NY, USA,
          <year>2019</year>
          , p.
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          . doi:10.1145/3331184.3331185.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiesel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bahrami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          , Toward voice query clarification,
          <year>2018</year>
          . URL: https://dl.acm.org/doi/pdf/10.1145/3209978.3210160
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Craswell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bennett</surname>
          </string-name>
          , G. Lueck,
          <article-title>Generating Clarifying Questions for Information Retrieval</article-title>
          , ACM, NY, NY, USA,
          <year>2020</year>
          , p.
          <fpage>418</fpage>
          -
          <lpage>428</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>