Towards System-Initiative Conversational Information Seeking

Somin Wadhwa, Hamed Zamani
Center for Intelligent Information Retrieval, University of Massachusetts Amherst, Amherst, MA, United States
sominwadhwa@cs.umass.edu (S. Wadhwa); zamani@cs.umass.edu (H. Zamani)

DESIRES 2021 – 2nd International Conference on Design of Experimental Search & Information REtrieval Systems, September 15–18, 2021, Padua, Italy
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Abstract
Presently, most conversational information seeking systems function in a passive, user-initiative manner. In this work, we discuss the importance of developing conversational information seeking systems capable of system-initiative interactions. We further discuss various aspects of such interactions in CIS systems and introduce a taxonomy of system-initiative interactions based on three orthogonal dimensions: initiation moment (when to initiate a conversation), initiation purpose (why to initiate a conversation), and initiation means (how to initiate a conversation). This taxonomy enables us to propose a generic pipeline for system-initiative conversations, consisting of three major steps associated with the three dimensions highlighted in the taxonomy. We further delineate the technical and evaluation challenges that the design and implementation of each component may encounter, and provide possible solutions. We finally point out potential broader impacts of system-initiative interactions in CIS systems.

Keywords
Conversational search, conversational information seeking, mixed-initiative conversations, conversational recommendation

1. Introduction

The rapid growth in speech and small-screen interfaces has significantly influenced the way users interact with intelligent systems to satisfy their information needs. The growing interest in personal digital assistants demonstrates the willingness of users to engage in conversational interactions. This has motivated the information retrieval community, both academic researchers and industry practitioners, to focus on conversational information seeking (CIS) as a major emerging research area.¹ It has also been recognized as one of the strategic directions of the community in the Third Strategic Workshop on Information Retrieval in Lorne (SWIRL 2018) [1].² However, current models and technology provide limited support for conversational understanding and the various types of interactions it requires. Recent research has made substantial progress on a number of tasks associated with conversational information seeking [2, 3, 4, 5], however, each with various simplifying assumptions on system abilities and user behavior that may not hold in a real-world CIS system [6, 7]. For instance, mixed-initiative interactions have been largely ignored in most recent work in the area. This is while mixed-initiative intelligent systems are believed to ultimately revolutionize the world of computing [7], and CIS systems provide an appropriate platform for supporting mixed-initiative interactions.

Recently, some forms of such interactions have been studied in the context of asking for clarification [8, 9, 10] or preference elicitation [11, 12]. Developing fully mixed-initiative conversational systems requires support for system-initiative (or agent-initiative) interactions, in which the CIS system initiates a conversation with the user(s). However, system-initiative interactions have been overlooked in the CIS literature. In this paper, we focus on this topic and discuss its importance for IR research and industry. We believe that real-life intelligent assistants can substantially benefit from supporting system-initiative interactions, and that this direction involves a large number of unsolved and non-trivial open questions that are worthy of research. To better demonstrate different aspects of the problem, we compile a taxonomy of system-initiative interactions based on three dimensions: (1) initiation moment: when to initiate a conversation, (2) initiation purpose: why to initiate a conversation, and (3) initiation means: how to initiate a conversation. We believe that system-initiative interactions can be categorized as either instant initiation or opportune moment initiation interactions. We provide example scenarios for each of these categories in Section 2.

The introduced taxonomy enables us to propose a generic pipeline for system-initiative interactions in CIS systems. The pipeline, introduced in Section 3, consists of three major steps that are aligned with the three dimensions in our taxonomy. We further review technical challenges in both modeling and evaluating each of these steps, in addition to discussing potential approaches for end-to-end evaluation of system-initiative CIS systems. We also highlight the dangers of system-initiative interactions in CIS systems if they are not designed carefully. We finally and briefly introduce the broader impact of this research direction. We believe this paper, despite being sometimes abstract or hypothetical, sheds light on some aspects of developing and evaluating system-initiative conversational information seeking systems.

¹ In this paper, we use CIS to refer to all conversational information seeking and access systems, including conversational search, recommendation, and question answering.
² https://sites.google.com/view/swirl3/

2. A Taxonomy of System-Initiative CIS Interactions

In this section, we review different interactions that may be taken by a CIS system to initiate a conversation. We study these interactions with respect to the following three orthogonal dimensions:

• initiation moment: when to initiate a conversation?
• initiation purpose: why to initiate a conversation?
• initiation means: how to initiate the conversation?

We believe that any CIS system should be able to answer all of the above questions in order to make system-initiative interactions. In the rest of this section, we explain these dimensions. This paper also proposes a pipeline for system-initiative interactions in CIS systems, inspired by the three dimensions introduced in the taxonomy.

2.1. Dimension I: Initiation Moment

Given the first dimension, i.e., when to initiate a conversation, we partition system-initiated conversational interactions into two categories:

• Instant initiation: the instant initiation of a conversation by the conversational information seeking system, mostly based on the user's current situation.
• Opportune moment initiation (OMI): the initiation of a conversation that can be postponed to an opportune moment decided by the conversational information seeking system.

In other words, the first category contains the interactions that should be initiated instantly and are not appropriate in other contexts. The second category, on the other hand, contains the interactions that can be initiated at a later time decided by the system.³ Therefore, the interaction time in instant initiation is determined by the user's situational context, e.g., the user's location, time, mood, and activity, or by the urgency of the interaction (e.g., health- and safety-related interactions), while in OMI it is the CIS system that decides the interaction time.

³ OMI interactions can also be triggered by the user at a convenient time.

2.2. Dimension II: Initiation Purpose

Conversation initiation may be triggered by the availability of new data that may be of interest to the user, by the current situation of the user such as time and location, or by modifications to the CIS system. The latter may happen, for example, if a new deployment of the CIS models reveals that the system provided false information on a sensitive topic in past interactions, and the system now wants to initiate a conversation to correct its past mistake. Given these three triggering reasons, we identify five main purposes for initiating a conversation in a CIS system: information filtering, recommendation, following up a past conversation, contributing to a multi-party conversation, and feedback request. Note that this paper only focuses on information seeking conversations; therefore, some non-information-seeking initiation purposes are not covered in this section.

In the following, we describe each of the identified initiation purposes. For each initiation purpose presented below, we provide instant initiation and opportune moment initiation example use-cases in Table 1.

Filtering streaming information. Information filtering systems aim to deliver information to the user from a stream of information content based on the user's preferences. Belkin and Croft [13] identified information retrieval and information filtering as two sides of the same coin, because of their fundamental similarities in representing unstructured or semi-structured documents and computing their relevance to the user's (short- or long-term) information needs. A few years later, Robertson and Hull [14] organized the TREC Filtering Tracks to promote the field and provide resources for fostering research in filtering tasks. Conversational information seeking systems may initiate a conversation with the goal of information filtering.
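As a toy illustration of how an information filtering purpose might feed a system-initiative CIS, the sketch below matches streamed items against a stored user profile and surfaces only those that clear a preference threshold. The `UserProfile` structure, the Jaccard scoring, and the threshold value are illustrative assumptions for exposition, not part of any deployed system.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """A minimal stand-in for the user's long-term preferences."""
    interests: set = field(default_factory=set)

def score(item_terms: set, profile: UserProfile) -> float:
    """Jaccard overlap between an item's topic terms and the user's interests."""
    if not item_terms or not profile.interests:
        return 0.0
    return len(item_terms & profile.interests) / len(item_terms | profile.interests)

def filter_stream(stream, profile, threshold=0.2):
    """Yield stream items similar enough to the profile to justify initiating
    a conversation about them."""
    for item in stream:
        if score(set(item["terms"]), profile) >= threshold:
            yield item
```

Whether a surfaced item warrants instant initiation or opportune moment initiation would then depend on the urgency of the item, as discussed under Dimension I.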
For instance, surfacing breaking news headlines based on the user's preferences is an information filtering task that may have applications in system-initiative CIS systems.

Recommendation. Recommender systems are often considered a subcategory of information filtering systems; however, we intentionally separate the two in this paper to highlight their differences and their distinct applications in system-initiative CIS systems. Unlike information filtering tasks, which deal with a stream of data, recommendation tasks in this paper refer to recommending entities or information from an existing data source. For instance, recommending a restaurant based on the user's location and preferences can be considered a recommendation task, but it does not fit well within the definition of information filtering tasks provided above. CIS systems may initiate a conversation to make a recommendation to the user.

Following up a past conversation. A CIS system may follow up a past conversation for many different reasons, such as providing new information that was not available at the time of the past conversation, correcting a mistake that was made by the system in a past conversation, or continuing a conversation that was interrupted and left incomplete. System initiation enables CIS systems to follow up past conversations to better serve their ultimate information seeking and access purpose.

Contributing to a multi-party human conversation. Existing conversational information seeking systems are mainly designed for user-system interactions. However, CIS systems can also contribute to multi-party human conversations, such as collaborative conversations. For instance, based on a conversation between two people, a CIS system that is permitted to monitor the conversation may contribute to the topic of the discussion, e.g., by fact-checking the claims made in the conversation and taking an initiative if a false claim is made by one party.

Feedback request. Feedback requests are not directly related to information seeking; however, user feedback, such as product reviews, plays a key role in the development of several information seeking systems. On the other hand, users often forget or refuse to provide feedback. In some cases, a CIS system may initiate a conversation with the goal of collecting feedback about the user's experiences. Such a conversation may convince users to provide feedback in cases where they normally would not.

Table 1: Examples for various initiation purposes (rows) based on initiation moments (columns).

Filtering streaming information based on the user profile
• Instant initiation: Health- and safety-related information is often time-sensitive. For instance, attacks or events that may lead to a safety risk or hazard for the user should be instantly mentioned by a CIS system that is watching these streaming information sources.
• Opportune moment initiation: News agencies constantly publish new content on their websites. Users, on the other hand, have different preferences and tastes in news topics and sources. A system-initiative CIS system may initiate a conversation, based on the opportune moment initiation scheme, to inform the user based on their preferences.

Recommendation
• Instant initiation: Many users create and maintain to-do lists for their daily activities. A few recent recommender systems have been developed to re-rank and recommend the next to-do item. Some of the items in a to-do list can be time-sensitive, and a CIS system can instantly initiate a conversation to notify the user that the deadline for one of the yet-to-be-done tasks in the to-do list is approaching; otherwise, the user will not be able to complete the task.
• Opportune moment initiation: Active engagement through CIS can also occur in broad opportune moments such as the pre-holiday season. People often exchange gifts during special occasions and holidays, and a CIS system could play an active role in offering gift recommendations to the user. Such active engagement would be time-sensitive, and in addition to user preferences for gift recommendations, a window of initiation would be equally relevant.

Following up a past user-system conversation
• Instant initiation: Any modification to the system's response to a health- or safety-related question that the user asked in the past may need prompt conversation initiation. For instance, if the user asks about the number of daily COVID-19 cases in an institute, and the system responds with zero, it may need to instantly initiate a conversation upon discovering a new case that day. (Note that many examples in this category also involve filtering of streaming information; however, such filtering should happen with respect to the past user-system interactions, which is different from the first row in this table.)
• Opportune moment initiation: CIS systems are by no means perfect, and they make mistakes in responding to a user's requests. Based on new information or new models deployed in the system, a CIS system may initiate a conversation at an opportune moment to acknowledge and correct mistakes that were made in the past.

Contributing to a multi-party human conversation
• Instant initiation: While largely unexplored in the literature, one possible use-case of system-initiative CIS engagement in a human-human interaction is monitoring the factual accuracy of the content exchanged in human conversations (if and where necessary). The CIS system may engage in retrieval-based fact-checking and initiate a conversation to contribute to the ongoing human conversation by providing the fact-checking results and details.
• Opportune moment initiation: Similar to the previous case, with a focus on monitored past human conversations (i.e., following up a past human conversation).

Feedback request
• Instant initiation: Asking for location- and time-specific feedback may need to happen promptly. For example, while a user is driving and passing by a specific location, a CIS system may initiate a conversation for a feedback request by asking about a car accident at that location.
• Opportune moment initiation: An example of an opportune moment feedback request is e-commerce shopping. Under current popular systems, users are often indiscriminately required to provide reviews of products right after they purchase them or after a pre-defined period of time. Factoring in the category of products along with user meta-data could enhance a CIS system's ability to gauge which moments would be most opportune for engaging in an active conversation about product feedback.

2.3. Dimension III: Initiation Means

How to initiate a conversation shapes the third dimension in our conversation initiation taxonomy. In a multi-device setting, the system should decide which device should be used to initiate a conversation. In a multi-modal setting, the system should decide which interaction channel (e.g., visual through a screen or aural through the speaker) or processing modality (e.g., verbal through text or non-verbal through an image) should be used for initiating the conversation. One can also imagine a system that asks for permission to initiate a conversation, for example via a light vibration.

3. A Pipeline for Conversation Initiation in CIS

As mentioned in the last section, information seeking conversations can be initiated by new information, by the situational user context, or by a new model deployment. In this section, we present a general high-level pipeline for initiating a conversation in CIS systems. Due to the complexity of developing and evaluating the pipeline for system-initiative interactions, we additionally provide a formal definition of each step. This formalization enables us to easily discuss evaluation methodologies for each component in Section 6. It also helps future work to see these steps in isolation. The pipeline is depicted in Figure 1. It consists of the following steps, which use the notation introduced in Table 2.

Table 2: Notation descriptions.

u: the user
p_u^t: the user profile and situational context associated with u at timestamp t
c_u^t: all the conversational interactions of u with the CIS system up to timestamp t
C^t: the collection of all information items available at timestamp t (e.g., from the web)
i: a system initiation instance object
D: a collection of system initiation instance objects

Step I: Producing system initiation instances. In the first step, system initiation instances are produced by the processes described in Section 2.2, such as recommendation and contributing to a multi-party conversation. They are shown as initiation purposes in Figure 1. These processes monitor the environment and produce instances that can lead to system initiation, e.g., by observing newly filtered information or a recommendation based on the user's context (see Section 2 for more detail about these processes). The produced conversation initiation instances are added to the instance collection (or database). Note that a system initiation instance is a data object that contains all the information required for initiating a conversation, including the initiation purpose; the data, context, or reason that led to the production of the instance; the initiation features and content; etc. This step can be formalized as a function of p_u^t, c_u^t, and C^t that produces one or more system initiation instances, i.e., φ(p_u^t, c_u^t, C^t).
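Using the notation of Table 2, the Step I contract and the instance object it produces can be sketched as follows. The concrete fields, the dictionary-based profile, and the single illustrative purpose ("filtering") are hypothetical choices for exposition, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class InitiationInstance:
    """A data object i holding everything needed to start a conversation."""
    purpose: str              # e.g., "filtering", "recommendation", "follow_up"
    content: dict             # the data that triggered this instance
    context: dict             # user profile / situational context at creation time
    urgency: str = "opportune"  # "instant" or "opportune" (Dimension I)

def phi(profile: dict, history: list, collection: list) -> list:
    """phi(p_u^t, c_u^t, C^t): produce zero or more initiation instances.

    As one illustrative purpose, any collection item whose topic appears in
    the user's interests yields a filtering instance."""
    instances = []
    for item in collection:
        if item.get("topic") in profile.get("interests", []):
            instances.append(InitiationInstance(
                purpose="filtering",
                content={"item": item},
                context={"profile": profile, "history_len": len(history)},
            ))
    return instances
```

In this framing, each initiation purpose of Section 2.2 would contribute its own producer, all writing instances of the same shape into the collection D.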
Figure 1: A generic pipeline for conversation initiation in CIS systems. Each initiation instance is a data object containing all information required for initiating a conversation, including the initiation purpose, content, and context.

Step II: Selecting an instance for conversation initiation. In the next step, the initiator component (see Figure 1) selects one of the entries in the instance collection D for initiating a conversation. Although some conversations need to be initiated promptly (i.e., instant initiation), in this pipeline all instances are inserted into D, and it is the job of the initiator to promptly identify instant initiation requests. In more detail, the initiator component constantly monitors all entries in D and, based on the user's situational context, decides which instance should be selected at each timestamp for conversation initiation. This step can be formalized as ψ(i, p_u^t) = Pr(initiation = 1 | i, p_u^t), where i is a system initiation instance in D and "initiation" is a binary hidden variable representing the event of initiating the conversation. Although everything mentioned in this paper concerns system-initiative conversations, note that the initiator can also be triggered by the user (for instance, the user may say "I'm bored, tell me something").

Step III: Conversation generation. Once the initiator component selects one of the instances from D, a natural language utterance is generated by the conversation generation component and presented to the user through an appropriate device and interaction modality (in the case of multi-device or multi-modal settings). Therefore, this step can be formally defined as a function that generates a conversation based on a given instance i and presents it to the user based on the user profile and situational context, i.e., γ(i, p_u^t).

4. User Response to System-Initiative Conversations

While users are free to respond in any form they see fit, for the system to function effectively we propose a categorization of responses based on how they are processed by the system:

• Null action: The user provides no response to the conversation initiated by the CIS system. Note that a null action should not necessarily be interpreted as negative feedback, since the user may find some initiations useful while not being interested in further engagement.
• Interruption or negation: The user provides a response consistent with shutting down any further engagement by the CIS system. Such a response can safely be interpreted as negative feedback.
• Relevant response: The user responds to the initiated conversation with a relevant answer. This is often expected when the initiated conversation involves a question or asks for feedback.
• Postpone: The user responds to the initiated conversation and asks the system to remind them at a later time.
• Critique or clarification-seeking response: The user engages in a back-and-forth conversation with the agent, either seeking further information or critiquing the existing engagement. One key technical challenge here, which we discuss in the next section, is processing the user response so that it can be used to improve the system.
• Follow up: The user responds with a follow-up to obtain further information or to perform actions related to the initiated conversation.
• Topic drift: The user responds but changes the topic of the initiated conversation.

Given the current status of text classification models and the complexity of the task, it is possible to achieve acceptable accuracy in classifying user responses into the above categories. These classifications can further be used for training or evaluating the conversation initiation process. For instance, interruption or negation may be considered negative feedback. Such feedback can be used to modify the models deployed for each of the three steps (φ, ψ, and γ) in the pipeline (see Section 3). On the other hand, receiving a relevant response may be considered positive feedback for the system.
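To make the feedback loop above concrete, the following is a rule-based stand-in for the response-type classifier; the keyword lists and the reward values are illustrative assumptions, and a deployed system would use a trained text classifier rather than substring matching.

```python
from typing import Optional

def classify_response(utterance: Optional[str]) -> str:
    """Map a user utterance to one of the response categories of Section 4
    (a coarse subset: null, negation, postpone, follow_up, relevant)."""
    if not utterance or not utterance.strip():
        return "null"
    text = utterance.lower()
    if any(kw in text for kw in ("stop", "not now", "go away", "don't")):
        return "negation"
    if any(kw in text for kw in ("later", "remind me", "tomorrow")):
        return "postpone"
    if text.endswith("?"):
        return "follow_up"
    return "relevant"

def feedback_signal(response_type: str) -> int:
    """Convert a response type into a weak reward for the pipeline models;
    null and postpone are deliberately neutral, per the discussion above."""
    return {"negation": -1, "relevant": 1, "follow_up": 1}.get(response_type, 0)
```

Signals of this kind could then be logged against the instance and moment that triggered the initiation, providing weak supervision for φ, ψ, and γ.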
5. Technical Challenges

In this section, we hypothesize certain key technical challenges in implementing the pipeline described in Section 3.

5.1. Producing System-Initiative Instances

The first step in the system-initiation pipeline is to identify reasons for initiating a conversation and to generate a system-initiative instance. As described in the last section, system-initiative instances are data objects that contain all the information about a system-initiative conversation, such as its purpose, content, and context. This step can be cast as implementing each of the five initiation purpose components discussed in Section 2.2. In other words, one needs to implement the function φ(p_u^t, c_u^t, C^t) with a focus on each initiation purpose. This has roots in various IR tasks, such as filtering and recommendation. However, some of the initiation purposes are relatively unstudied in the literature, such as following up a past conversation or contributing to a multi-party conversation. Even feedback requests in the form of active conversation are underexplored. Therefore, one of the major technical challenges in producing system-initiative instances is to develop models that can identify the reasons for conversation initiation, whether the goal is filtering of streaming information, recommendation, conversation follow-up, contributing to a multi-party conversation, or feedback request.

5.2. Developing an Initiator Model

The second step in the provided pipeline is selecting a system-initiative instance from the instance collection D using an initiator component (see Figure 1). This is equivalent to implementing the function ψ(i, p_u^t). We believe that the most challenging part of implementing such a component is our lack of knowledge about what is generally the right moment for initiating a conversation. Therefore, we believe that future research should focus on conducting user studies in the wild to explore what the right times to initiate a conversation are. Some weak supervision signals can be mined from user interactions with current conversational systems, even if they do not support system-initiative interactions. For example, the times when the user initiates an unimportant conversation (e.g., out of boredom) can provide a weak (noisy) signal for a potentially good time to initiate a conversation, and thus machine learning models can be trained on the situational context and the user profile to predict such moments. Of course, a nice property of interactive systems that log user interactions is the ability to iteratively improve the system's prediction of such moments based on the feedback received from the user (see Section 4 for the various types of user responses to system-initiative interactions).

5.3. Generating System-Initiative Utterances

The third and final step in our pipeline (Section 3) is to generate a conversation based on a system-initiative instance and present it to the user, equivalent to implementing the function γ(i, p_u^t). We believe that many techniques developed in dialogue systems and text generation research can be used to implement this component. Each instance i is a structured data object; therefore, neural models for generating unstructured text from structured data, e.g., tables, can potentially be adopted. Since users mostly do not expect system-initiative utterances, an interesting technical challenge here is providing enough context in the generated utterance to make sure the user understands why the conversation is being initiated. This context may refer to a previous interaction of the user with the system, a past experience of the user, or an explanation of the reason that led to the generation of the system-initiative conversation.

6. Evaluating System-Initiative CIS Systems

Evaluation is one of the most challenging aspects of system-initiative CIS systems. IR research has a long history of collection creation for various information seeking tasks; however, these collections are mostly created based on a set of pre-defined information needs (e.g., most TREC tracks) or a set of observations (e.g., clickthrough data). Such evaluation methodologies do not easily extend to active interaction scenarios, such as system initiation in conversation. Although evaluating system-initiative CIS systems is yet to be explored in the literature, in this section we detail our perspective on potential evaluation methodologies that can be pursued.

6.1. Evaluating System Initiation Instances (φ)

As pointed out in Section 3, the first step towards initiating a conversation is to produce system initiation instances, formalized using the function φ(p_u^t, c_u^t, C^t). The initiation instance should include all information about the nature of the conversation initiation. To evaluate this component, we should provide all the required information at timestamp t to the system as input and evaluate the produced initiation instances. The required information (as depicted in Figure 1) includes past user-system interactions, the user profile, the user's situational context, and a stream of new information content.
The model should to being precise and fluent. A number of ground truth produce an initiation command or NUll, meaning that no reference utterances can be generated through manual initiation is needed. Both precision and recall-oriented annotation and popular text generation metrics such as metrics should be used to evaluate the model’s perfor- BLEU [17], ROUGE [18], and BERTScore [19] may be mance. In fact, the produced initiation instances should used to evaluate the model. As discussed in [20], despite be of high quality (all initiations should be relevant to the the popularity of these metrics, they do not necessarily re- user) with high coverage (all required initiations should flect the quality of the produced dialogue, and ultimately be produced by the model). human annotation of the model’s outputs is desired. Based on this evaluation methodology, a reusable col- lection can be created. The data collection can be either 6.4. End-to-End Evaluation of sampled from a real user’s interaction history (a realistic System-Initiated Conversations setting, but requires access to real user-system conversa- tional interactions), or constructed based on information The last three subsections discuss component-level eval- seeking interactions between two or more people. The uation of system-initiative CIS systems. As mentioned latter can be done in a lab study using a wizard of oz above, each is based on some simplifying assumptions setting, similar to [15]. of other components of the system, which is unrealistic. Therefore, an end-to-end evaluation of system-initiated conversations should be explored. To do so, both offline 6.2. Evaluating Initiation Moments (𝜓 ) and online evaluation strategies can be adopted. For To evaluate the initiator component in the proposed offline evaluation, each instance would include all the pipeline (see Figure 1), one can cast the problem to a required information for the system at a timestamp 𝑡 binary classification task. 
6.4. End-to-End Evaluation of System-Initiated Conversations

The last three subsections discuss component-level evaluation of system-initiative CIS systems. As mentioned above, each is based on simplifying assumptions about the other components of the system, which is unrealistic. Therefore, an end-to-end evaluation of system-initiated conversations should also be explored. To do so, both offline and online evaluation strategies can be adopted. For offline evaluation, each instance would include all the required information for the system at a timestamp 𝑡 as input, including past user-system interactions, the user profile, the situational context, and a stream of new information. The model would then be evaluated based on the produced system-initiated conversations (if needed). Designing a single evaluation metric that can reflect all aspects of conversation initiation would be challenging and requires further investigation. Approaches like economic models of interactive information retrieval, which model the system by assigning a cost and a benefit to each interaction, may be relevant. In the case of online evaluation, typical A/B tests can be used, and the system can be evaluated by interpreting the positive and negative feedback received from the user. Such feedback can be obtained by identifying the user response type (see Section 4).

6.5. End-to-End Evaluation of Mixed-Initiative Conversations

System-initiated conversations are just one type of interaction that a mixed-initiative CIS system may support. There exist several other interactions, such as the typical user-initiated information seeking conversations and asking clarifying questions for intent disambiguation [21]. The ultimate evaluation methodology should assess the quality of the system in all of these different settings. Such complex end-to-end evaluation can again be done using both online and offline evaluation on data that contains all of these different sorts of interactions. Similar approaches to the one mentioned in the last subsection can be adopted; however, designing an evaluation metric for this purpose would be even more challenging.
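One possible shape for such an offline collection, together with an economic-model style score, is sketched below. The field names, the benefit/cost weights, and the scoring rule are all illustrative assumptions rather than a prescribed metric; in practice the weights would be fit empirically per task.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluationInstance:
    """One offline test case; fields mirror the inputs listed in the text
    (past interactions, profile, situational context, new information)."""
    timestamp: int
    past_interactions: list
    user_profile: dict
    situational_context: dict
    new_information: list
    gold_initiation: Optional[str] = None  # None when no initiation is expected


def interaction_utility(initiated: bool, instance: EvaluationInstance,
                        benefit: float = 1.0, cost: float = 0.4) -> float:
    """Economic-model style score for one decision: a correct initiation
    earns its benefit minus the interaction cost, a spurious initiation
    pays only the cost, and a missed initiation forfeits the benefit."""
    should_initiate = instance.gold_initiation is not None
    if initiated:
        return (benefit if should_initiate else 0.0) - cost
    return -benefit if should_initiate else 0.0
```

Averaging this utility over a collection of such instances gives one crude single-number summary of the cost/benefit trade-off discussed above.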
7. Dangers of System-Initiative Interactions

Privacy Concerns. Even with existing conversational information systems, users often have privacy concerns about how their information is processed, to whom it is disclosed, and what the associated risks are [22]. We envision that those concerns will only be exacerbated, if left unaddressed, with a system capable of processing far more sensitive user information and engaging in an active form of conversation. Hence, we believe that these privacy concerns must be addressed while designing and implementing active CIS systems. Secure information retrieval and data sharing protocols would be needed to safeguard users' identifiable information and to assure users that it remains secure. Even so, safeguarding user information may not instill a sense of security among end users if the activity of the underlying system comes off as too intrusive. For instance, one of the use cases for an active engagement CIS is that of contributing to a multi-party human conversation (Table 1). Active system engagement in a multi-party setting has been a largely unexplored area in the IR literature and raises new and unique privacy concerns of its own. For example, one lingering question is: how should we obtain consent from all the humans whose data the system processes? Fully studying the privacy implications of such a system would require extensive user studies, often on a task-by-task basis, and assessing end-users' perception of the system behavior itself.

Badly Timed Engagements. Arguably, one of the most important components of an active engagement CIS is its initiator decision making system, which decides when to initiate a conversation and, perhaps more importantly, when not to initiate one. Engagements made by the system at a bad time can be counter-productive or even downright dangerous. For example, while initiating a non-time-sensitive conversation, the agent must not disturb or distract the user in any way. An unexpected system engagement when the user is engaged in a critical activity, e.g., driving, can be extremely dangerous and distracting. Therefore, while developing an initiator module, we must also actively penalize the module if it engages at a particularly bad time.
8. Broader Impact

This paper highlighted various real-world applications of system-initiative interactions in conversational information seeking systems. The authors believe that research progress in modeling and evaluating system-initiative CIS can potentially lead to a broad impact. Health and safety related conversations can be initiated by CIS systems to warn users of potential harms and hazards. Such system-initiative interactions can be triggered based on the user's situational context, such as location or health-related signals captured by sensors embedded in smartphones and wearable devices. Furthermore, these systems can potentially inform victims of misinformation or abusive content that targets them through human conversations, written documents, or ads. Different types of entertainment, which may or may not be related to information seeking, can also be an application of system-initiative interactions. Moreover, with the progress of virtual and augmented reality devices, system-initiative interactions (especially those of an information seeking nature) will be of great importance: as the user explores a virtual environment, a system-initiative CIS can guide them through it.

9. Related Work

The study of interaction has a long history in information retrieval research, starting in the 1960s [23]. Much of the earlier research studied how users interacted with intermediaries during information seeking dialogues, but this rapidly shifted to studying how users interacted with operational retrieval systems, including proposals for how to improve the interaction. Information retrieval systems based on this research were also implemented. Oddy [24] developed an interactive information retrieval system with rule-based dialogue interactions in 1977. Croft and Thompson [25] later proposed I3R, the first interactive information retrieval system that models the user, using a mixture of experts architecture. A few years later, Belkin et al. [26] characterized information seeking strategies for interactive IR, offering users choices in a search session based on case-based reasoning.

Since the development of web search engines, research has focused heavily on understanding user interaction with search engines through analysis of the search logs available to commercial search engine providers. Explicit modeling of information seeking dialogues or conversations with the aim of improving retrieval effectiveness has, however, not been a focus of research until recently. One exception is the TREC Session Track [27], which focused on the development of query formulation during a search session and on improving retrieval performance by incorporating knowledge of the session context. Meanwhile, commercial personal assistants such as Apple Siri and Google Assistant have become commonplace, and there is a clear incentive to develop better conversational models for search. A promising development has been the effectiveness of neural models for generating conversational responses when trained on large amounts of data (e.g., [28]).
In recent years, conversational information seeking systems have attracted attention in both academia and industry [1, 29]. They include conversational search, recommendation, and question answering systems. CIS systems are sufficiently broad to cover a wide range of tasks. The research community has so far studied a number of them, including conversational answer retrieval [2], conversational answer extraction (often referred to as conversational question answering) [3], conversational query re-writing [30], next question prediction [31], speech-only interfaces for conversational search [32], and question-based recommendation (often referred to as conversational recommendation) [33]. In all of these tasks, the user initiates the conversation with the CIS system and the system responds. Even in the case of existing conversational recommender systems, the conversations are initiated by users [33]. In this work, however, we discuss challenges and possible solutions for extending existing models to support system-initiative conversations.

System-initiative conversations are indeed related to mixed-initiative interactions (Section 9.1). There are some other related research directions that may be outside of the IR community, including dialogue acts (Section 9.2), system-initiative dialogue systems (Section 9.3), and push notifications in desktop and mobile apps (Section 9.6). In the following, we present an overview of these related domains to position our work in context.

9.1. Mixed-Initiative Interactions

Most approaches to human-computer interaction with intelligent systems are controlled either by the human or by the system. However, developing intelligent systems that support mixed-initiative interactions has always been desired. Allen et al. [7] believed that the development of mixed-initiative intelligent systems would ultimately revolutionize the world of computing. Mixed-initiative interactions in dialogue systems have been explored since the 1980s [34, 35, 36]. Early attempts to build systems that support mixed-initiative interactions include the LookOut system [37] for scheduling and meeting management in Microsoft Outlook, Clippit⁴ for assisting users in Microsoft Office, and TRIPS [38] for assisting users in problem solving and planning.

Horvitz [37] identified 12 principles that systems with mixed-initiative user interfaces must follow. In summary, mixed-initiative interactions should be taken at the right time in light of cost, benefit, and uncertainty. Many factors that can impact the cost and benefit of interactions are covered in multiple principles. In addition, systems with mixed-initiative interactions should put the user at the center and allow efficient invocation and termination. They are expected to memorize past interactions and continuously learn by observation. Based on these principles, conversational systems by nature raise the opportunity for mixed-initiative interactions.

Allen et al. [7] defined four levels of mixed-initiative interaction in the context of dialogue systems, as follows:

1. Unsolicited reporting: An agent notifies others of critical information as it arises. For example, an agent may constantly monitor the progress of the plan under development. In this case, the agent can notify the other agents (e.g., the user) if the plan changes.

2. Subdialogue initiation: An agent initiates subdialogues to clarify, correct, and so on. For example, in a dialogue between a user and a system, the system may ask a question to clarify the user's intent. Since the system asks the question, the user answers it, and clarification may take multiple interactions, the system has temporarily taken the initiative until the issue is resolved. This is why it is called subdialogue initiation.

3. Fixed subtask initiation: An agent takes the initiative to solve predefined subtasks, for example, when the agent is supposed to complete a task that involves multiple subtasks. In this case, the agent can take the initiative to ask questions and complete each subtask. Once the subtask is completed, initiative reverts to the user.

4. Negotiated mixed-initiative: Agents coordinate and negotiate with other agents to determine initiative. This is mainly defined for multi-agent systems in which agents decide whether they are qualified to complete a task or whether it should be left to other agents.

When it comes to open-domain conversational information seeking, some of these mixed-initiative levels remain valid. Mixed-initiative conversational information seeking has been relatively less explored, but has nevertheless been identified as a critical component of a conversational system [6, 39]. Clarification [8, 40, 10] and preference elicitation [11, 12] are perhaps the two areas related to mixed-initiative interactions that have attracted the most attention in recent years. However, they are mostly unrelated to system-initiative interactions, which remain relatively unexplored. Nevertheless, the unsolicited reporting level of mixed-initiative interactions mentioned above includes several interesting example use-cases for system-initiative CIS systems.

⁴ https://en.wikipedia.org/wiki/Office_Assistant
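Allen et al.'s four levels can be written down directly as an enumeration; the mapping from example system behaviors to levels below is our own illustration of the taxonomy, not part of the original definition.

```python
from enum import Enum


class InitiativeLevel(Enum):
    """Allen et al.'s four levels of mixed-initiative interaction."""
    UNSOLICITED_REPORTING = 1        # notify others when critical information arises
    SUBDIALOGUE_INITIATION = 2       # temporarily take over to clarify or correct
    FIXED_SUBTASK_INITIATION = 3     # drive a predefined subtask, then yield
    NEGOTIATED_MIXED_INITIATIVE = 4  # agents negotiate who takes the lead


# Illustrative mapping from system behaviors to levels (our own examples).
EXAMPLES = {
    "plan_change_notification": InitiativeLevel.UNSOLICITED_REPORTING,
    "clarifying_question": InitiativeLevel.SUBDIALOGUE_INITIATION,
    "form_filling_subtask": InitiativeLevel.FIXED_SUBTASK_INITIATION,
    "multi_agent_task_assignment": InitiativeLevel.NEGOTIATED_MIXED_INITIATIVE,
}
```

Under this framing, the system-initiative conversations studied in this paper sit mostly at the unsolicited reporting level.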
9.2. Dialogue Acts in Conversational Systems

Spoken dialogue systems (SDS) allow interaction with computer-based applications (e.g., smart speakers) through spoken natural language. Certain SDS mechanisms are specifically designed to carry out well-defined tasks, e.g., scheduling [41], and most of them are based on finite-state dialogue control. Although the focus of CIS research is mostly on open-domain information seeking tasks, such dialogue acts can potentially be used to support a diverse set of modes and scenarios in system-initiative CIS systems. A range of prior studies on dialogue acts also offer insights into designing models for conveying information through conversations. For example, prior work by Bunt et al. [42] offers promising features derived from broad dialogues to better model information needs; in our work, however, we assume that an alternate method might be required to extract similar information needs from user meta-data. Dialogue acts can potentially serve as communicative functions of dialogue segments, such as request, inform, question, suggest, and offer. The general taxonomy of dialogue acts is complex, with different markup schemes. One segment of particular interest to us, and one often not examined in the IR literature, is that of turn taking [43]. For instance, our conversational agent will have control over the dialogue, and the segments might be produced through an analysis of user data. Such an analysis of user meta-data (e.g., location) to personalize an IR task is not new, and is most commonly applied in the context of personalized web search [44]. Some personalization methods leverage long-term user behavioral histories [45], while others analyze short-term implicit feedback [46]. A key challenge that we see with personalization, especially when applied to active-engagement conversational systems, is that of collecting user profiles with sufficiently rich features while balancing privacy concerns. We leave that as an essential component of our future work.

9.3. Initiative Control in Dialogue Systems

Discourse segmentation through transfer of control in dialogue systems was first studied decades ago by Walker and Whittaker [47] to enable flexible human-computer conversations that allow for corrections and clarifications. Since then, a number of studies have sought to determine the ideal behavior of a virtual assistant [48, 49]. One of the key aspects of such ideal behavior is initiative, and a number of authors have considered what constitutes initiative [50, 51, 52]. Instead of a human, at certain relevant points in time, the system may take the initiative to engage in a conversation. Among all such systems, there is some form of dialogue management component which determines what to prompt for and/or what to accept next based on the conversation history and its context [53]. Such a management component plays a central role in the traditional architecture of a dialogue system and is primarily concerned with the flow of the dialogue (information providing, feedback requests, etc.) while simultaneously maintaining a discourse history. For instance, Vakulenko et al. [54] have shown how an agent might effectively take the initiative to elicit or clarify information when appropriate.

In addition to standalone engagement by a conversational agent, studies have also shown that the sources of information that led to that engagement are equally important; e.g., different sources have varying influence on purchase decisions, implying that the effectiveness of a conversational information system depends on the system saying why it made a specific decision or recommendation [28, 55].
9.4. Contextual Suggestions

The Contextual Suggestion track within TREC [56] aimed at providing personalized point-of-interest recommendations to users in a ranked manner. The task assumes a certain setting: a user in a specific place (geographic location) with a trip type. Given the user's personal profile (interests, endorsements, etc.), the system makes recommendations for attractions. The track consisted of two phases. In Phase 1, participants could select any venue from the reference collection. In Phase 2, participants had to rank a given list of venues for each user, thus providing ground truth data against which systems could be evaluated. Early works on this task involved rule-based approaches that map user profiles to specific venues. More recently, researchers have experimented with standard machine learning [57] and neural methods [58, 59] for mapping user profiles to relevance-rated documents. For example, Seyler et al. [58] create graph embeddings from a heterogeneous information network (HIN) using the TREC Contextual Suggestion dataset, achieving state-of-the-art performance.

9.5. Incident Streams

The TREC-IS track in 2018 [60] focused on curating feeds of social media posts and classifying them based on actionable information for enhanced situational awareness (such as in emergencies). Incident streams are relevant in the context of system-initiative conversations due to the underlying nature of the task: analyzing large sets of textual information related to user profiles and acting in a time-sensitive manner. In addition to the type of information, the TREC-IS evaluation tasks also include a criticality score indicating how important it is for specific content to be acted upon.

9.6. Push Notifications

Push notifications have mostly been studied in the context of mobile applications, largely with e-commerce goals [61, 62, 63]. Much of the prior research on push notifications has focused on their disruptive nature [64, 65]. For example, Mehrotra et al. [66] provided an in-depth study evaluating how the user response time to a non-time-sensitive notification is influenced by the notification's presentation and modality, as well as the sender-recipient relationship. Mehrotra et al. [67] further detailed, through extensive user studies, how push notifications with different contexts and timings can cause disruptions. This is an especially important consideration, since one of the main goals of an active engagement system is to minimize disruptions caused to the end-user. Other works in the area have explored the use of push notifications for meta-learning [68] and self-logging [69] to better adapt the underlying framework for adjusting user preferences. Push notifications are basically system-initiative interactions; however, they mostly do not concern information seeking tasks and are fundamentally different from system-initiative interactions in conversational systems.

9.7. Information Need in Collaborative Conversations

Over time, a number of definitions of information need have been conceptualized [70, 71]. For our work, we consider the one by Case [72], i.e., an information need is a recognition that the user's knowledge is inadequate to satisfy their own goals, as it implies that the information need must emerge from the user's end. Collaborative conversations offer one such instance: as articulated by Shiga et al. [73], information needs in such conversations are naturally verbalized and can therefore be captured by end-user devices. Furthermore, we can utilize the taxonomy of information needs defined by Taylor [74], which consists of visceral, conscious, formalized, and compromised needs, to differentiate between perceived needs and actual queries. For the purposes of a conversational information system actively engaging in a collaborative discussion, we primarily focus on conscious needs, which are defined as "ambiguous and rambling statement[s]" that ultimately evolve into formalized needs (qualified and rational). Prior work by Jansen et al. [75] on analyzing conversational query logs has shown that users often frame short and under-specified queries to information seeking systems; however, modern QA models are often capable of formalizing such information needs on community-based QA sites or speech-oriented search systems. The degree of interest in collaborative conversational information search has increased since then and has led to quantitative analyses of conversations during search [76, 77, 78]. For example, Foster [79] performed a full discourse analysis of group conversations to define the relationship between functions of verbal context and information seeking activity. While these studies remain either conceptual or use small amounts of textual chat data, they nevertheless suggest that collaborative conversations can be a useful source for conversational information seeking.

In this research, we also highlight some applications of system-initiative conversational interactions in the context of multi-party and collaborative conversations.
10. Summary

In this work, we explored applications of, and ways to model, an active engagement conversational information seeking system. We defined a taxonomy upon which a framework for an active engagement system can be built. Our taxonomy defines three broad dimensions of an active engagement framework: initiation moment (when to initiate a conversation), initiation purpose (why to initiate a conversation), and initiation means (how to initiate a conversation). Subsequently, we showed, through the explained examples and a pipeline, how the described characteristics are both necessary and sufficient to allow an active engagement information seeking system to function for a number of initiation purposes. In doing so, we also generalized several components of the pipeline that have been implemented before with proven effectiveness. We view the contribution of our work as this taxonomy and the proposed generic pipeline, which can be employed towards building truly active-engagement information seeking systems. Implementing and evaluating the proposed framework in a user-centric way remains the most important future direction suggested by this work. It is further worth considering the numerous technical and evaluation challenges that come with the proposed approach. Finally, we believe that our identified use cases of active engagement CIS systems will serve as a founding basis for other, broader case-specific applications, such as aiding people with disabilities, as highlighted in Section 8, and we hope this work spurs additional research in this largely unexplored area of information seeking.

Acknowledgments

This work was supported in part by the Center for Intelligent Information Retrieval. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.

References

[1] J. S. Culpepper, F. Diaz, M. D. Smucker, Research frontiers in information retrieval: Report from the third strategic workshop on information retrieval in Lorne (SWIRL 2018), SIGIR Forum 52 (2018) 34–90.

[2] C. Qu, L. Yang, C. Chen, M. Qiu, W. B. Croft, M. Iyyer, Open-retrieval conversational question answering, in: SIGIR '20, ACM, New York, NY, USA, 2020, pp. 539–548.

[3] S. Reddy, D. Chen, C. D. Manning, CoQA: A conversational question answering challenge, Transactions of the Association for Computational Linguistics 7 (2019) 249–266.

[4] Y. Sun, Y. Zhang, Conversational recommender system, in: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR '18, ACM, New York, NY, USA, 2018, pp. 235–244.

[5] P. Thomas, D. McDuff, M. Czerwinski, N. Craswell, MISC: A data set of information-seeking conversations, in: Proceedings of the 1st International Workshop on Conversational Approaches to Information Retrieval, CAIR '17, 2017.

[6] F. Radlinski, N. Craswell, A theoretical framework for conversational search, in: Proceedings of the 2017 Conference on Human Information Interaction and Retrieval, CHIIR '17, ACM, New York, NY, USA, 2017, pp. 117–126.

[7] J. E. Allen, C. I. Guinn, E. Horvitz, Mixed-initiative interaction, IEEE Intelligent Systems and their Applications 14 (1999) 14–23.

[8] M. Aliannejadi, H. Zamani, F. Crestani, W. B. Croft, Asking clarifying questions in open-domain information-seeking conversations, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '19, ACM, New York, NY, USA, 2019, pp. 475–484.

[9] H. Zamani, M. Bendersky, X. Wang, M. Zhang, Situational context for ranking in personal search, in: Proceedings of the 26th International Conference on World Wide Web, WWW '17, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 2017, pp. 1531–1540.

[10] H. Zamani, B. Mitra, E. Chen, G. Lueck, F. Diaz, P. N. Bennett, N. Craswell, S. T. Dumais, Analyzing and learning from user interactions for search clarification, in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '20, ACM, New York, NY, USA, 2020, pp. 1181–1190.

[11] F. Radlinski, K. Balog, B. Byrne, K. Krishnamoorthi, Coached conversational preference elicitation: A case study in understanding movie preferences, in: Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue, ACL, Stockholm, Sweden, 2019, pp. 353–360.

[12] A. Sepliarskaia, J. Kiseleva, F. Radlinski, M. de Rijke, Preference elicitation as an optimization problem, in: Proceedings of the 12th ACM Conference on Recommender Systems, RecSys '18, ACM, New York, NY, USA, 2018, pp. 172–180.
[13] N. J. Belkin, W. B. Croft, Information filtering and information retrieval: Two sides of the same coin?, Commun. ACM 35 (1992) 29–38.

[14] S. Robertson, D. A. Hull, The TREC-9 filtering track final report, in: Proceedings of the Ninth Text REtrieval Conference, TREC-9, National Institute of Standards and Technology, Special Publication, 2000, pp. 25–40.

[15] A. Weiss, R. Bernhaupt, D. Dürnberger, M. Altmaninger, R. Buchner, M. Tscheligi, User experience evaluation with a wizard of oz approach: Technical and methodological considerations, 2010, pp. 303–308.

[16] K. Järvelin, J. Kekäläinen, Cumulated gain-based evaluation of IR techniques, ACM Transactions on Information Systems 20 (2002) 422–446.

[17] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, BLEU: A method for automatic evaluation of machine translation, in: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, ACL, Philadelphia, PA, USA, 2002, pp. 311–318.

[18] C.-Y. Lin, ROUGE: A package for automatic evaluation of summaries, in: Text Summarization Branches Out, ACL, Barcelona, Spain, 2004, pp. 74–81.

[19] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, Y. Artzi, BERTScore: Evaluating text generation with BERT, ArXiv abs/1904.09675 (2020).

[20] N. Mathur, T. Baldwin, T. Cohn, Tangled up in BLEU: Reevaluating the evaluation of automatic machine translation evaluation metrics, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL, Online, 2020, pp. 4984–4997.

[21] E. Pitler, K. Church, Using word-sense disambiguation methods to classify web queries by intent, in: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP '09, ACL, Singapore, 2009, pp. 1428–1436.

[22] S. Zimmerman, A. Thorpe, C. Fox, U. Kruschwitz, Investigating the interplay between searchers' privacy concerns and their search behavior, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '19, ACM, New York, NY, USA, 2019, pp. 953–956.

[23] D. Kelly, C. R. Sugimoto, A systematic review of interactive information retrieval evaluation studies, 1967–2006, J. Assoc. Inf. Sci. Technol. 64 (2013) 745–770.

[24] R. Oddy, Information retrieval through man-machine dialogue, Journal of Documentation 33 (1977) 1–14.

[25] W. B. Croft, R. H. Thompson, I3R: A new approach to the design of document retrieval systems, JASIS 38 (1987) 389–404.

[26] N. J. Belkin, C. Cool, A. Stein, U. Thiel, Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems, Expert Systems with Applications 9 (1995) 379–395.

[27] B. Carterette, P. Clough, M. Hall, E. Kanoulas, M. Sanderson, Evaluating retrieval over sessions: The TREC session track 2011–2014, in: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, ACM, New York, NY, USA, 2016, pp. 685–688.

[28] A. Sordoni, M. Galley, M. Auli, C. Brockett, Y. Ji, M. Mitchell, J.-Y. Nie, J. Gao, B. Dolan, A neural network approach to context-sensitive generation of conversational responses, in: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL '15, ACL, Denver, CO, USA, 2015, pp. 196–205.

[29] A. Anand, L. Cavedon, H. Joho, M. Sanderson, B. Stein, Conversational search (Dagstuhl Seminar 19461), Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.

[30] S. Yu, J. Liu, J. Yang, C. Xiong, P. Bennett, J. Gao, Z. Liu, Few-shot generative conversational query rewriting, in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '20, ACM, New York, NY, USA, 2020, pp. 1933–1936.

[31] L. Yang, H. Zamani, Y. Zhang, J. Guo, W. Croft, Neural matching models for question retrieval and next question prediction in conversation, in: Proceedings of the 2017 ACM SIGIR Workshop on Neural Information Retrieval, NeuIR @ SIGIR '17, 2017.

[32] J. R. Trippas, Spoken conversational search: Speech-only interactive information retrieval, in: Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval, CHIIR '16, ACM, New York, NY, USA, 2016, pp. 373–375.

[33] W. Lei, X. He, M. de Rijke, T.-S. Chua, Conversational recommendation: Formulation, methods, and evaluation, in: SIGIR '20, ACM, New York, NY, USA, 2020, pp. 2425–2428.

[34] H. Kitano, C. Van Ess-Dykema, Toward a plan-based understanding model for mixed-initiative dialogues, in: Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, ACL '91, ACL, USA, 1991, pp. 25–32.

[35] D. G. Novick, S. A. Douglas, Control of Mixed-Initiative Discourse through Meta-Locutionary Acts: A Computational Model, Technical Report, USA, 1988. AAI8911322.

[36] M. Walker, S. Whittaker, Mixed initiative in dialogue: An investigation into discourse segmentation, in: Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, ACL '90, ACL, USA, 1990, pp. 70–78.

[37] E. Horvitz, Principles of mixed-initiative user interfaces, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '99, ACM, New York, NY, USA, 1999, pp. 159–166.

[38] G. Ferguson, J. F. Allen, TRIPS: An integrated intelligent problem-solving assistant, in: Proceedings of the Fifteenth National Conference on Artificial Intelligence and Tenth Innovative Applications of Artificial Intelligence Conference, AAAI '98/IAAI '98, AAAI Press / The MIT Press, 1998, pp. 567–572.

[39] J. R. Trippas, D. Spina, L. Cavedon, H. Joho, M. Sanderson, Informing the design of spoken conversational search: Perspective paper, in: Proceedings of the 2018 Conference on Human Information Interaction & Retrieval, CHIIR '18, ACM, New York, NY, USA, 2018, pp. 32–41.

[40] H. Zamani, S. Dumais, N. Craswell, P. Bennett, G. Lueck, Generating clarifying questions for information retrieval, in: Proceedings of The Web Conference 2020, WWW '20, ACM, New York, NY, USA, 2020, pp. 418–428.

[41] M. McTear, Z. Callejas, D. Griol, The Conversational Interface: Talking to Smart Devices, 1st ed., Springer Publishing Company, Incorporated, 2016.

[42] H. Bunt, J. Alexandersson, J. Carletta, J.-W. Choe, A. C. Fang, K. Hasida, K. Lee, V. Petukhova, A. Popescu-Belis, L. Romary, C. Soria, D. Traum, Towards an ISO standard for dialogue act annotation, in: Proceedings of the Seventh International Conference on Language Resources and Evaluation, LREC '10, ELRA, Valletta, Malta, 2010.

[43] H. Khouzaimi, R. Laroche, F. Lefèvre, Turn-taking phenomena in incremental dialogue systems, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP '15, ACL, Lisbon, Portugal, 2015, pp. 1890–1895.

[44] P. N. Bennett, F. Radlinski, R. W. White, E. Yilmaz, Inferring and using location metadata to personalize web search, in: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, ACM, New York, NY, USA, 2011, pp. 135–144.

[45] N. Matthijs, F. Radlinski, Personalizing web search using long term browsing history, in: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, WSDM '11, ACM, New York, NY, USA, 2011, pp. 25–34.

[46] X. Shen, B. Tan, C. Zhai, Context-sensitive information retrieval using implicit feedback, in: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '05, ACM, New York, NY, USA, 2005, pp. 43–50.

[47] M. Walker, S. Whittaker, Mixed initiative in dialogue: An investigation into discourse segmentation, in: Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, ACL '90, ACL, Pittsburgh, PA, USA, 1990, pp. 70–78.

[50] J. E. Allen, C. I. Guinn, E. Horvitz, Mixed-initiative interaction, IEEE Intelligent Systems and their Applications 14 (1999) 14–23.

[51] J. Chu-Carroll, M. K. Brown, An evidential model for tracking initiative in collaborative dialogue interactions, User Model. User Adapt. Interact. 8 (1998) 215–254.

[52] Z. Wang, J. Lee, S. Marsella, Towards more comprehensive listening behavior: Beyond the bobble head, in: Intelligent Virtual Agents, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 216–227.

[53] C. Lee, S. Jung, K. Kim, D. Lee, G. Lee, Recent approaches to dialog management for spoken dialog systems, J. Comput. Sci. Eng. 4 (2010) 1–22.

[54] S. Vakulenko, E. Kanoulas, M. de Rijke, An analysis of mixed initiative and collaboration in information-seeking dialogues, in: SIGIR '20, ACM, New York, NY, USA, 2020, pp. 2085–2088.

[55] N. Tintarev, J. Masthoff, Effective explanations of recommendations: User-centered design, in: Proceedings of the 2007 ACM Conference on Recommender Systems, RecSys '07, ACM, New York, NY, USA, 2007, pp. 153–156.

[56] S. H. Hashemi, J. Kamps, J. Kiseleva, C. L. Clarke, E. M. Voorhees, Overview of the TREC 2016 contextual suggestion track, in: TREC, 2016.

[57] M. Aliannejadi, F. Crestani, Venue appropriateness prediction for personalized context-aware venue suggestion, in: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '17, ACM, New York, NY, USA, 2017, pp. 1177–1180.

[58] D. Seyler, P. Chandar, M. Davis, An information retrieval framework for contextual suggestion based on heterogeneous information network embeddings, in: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR '18, ACM, New York, NY, USA, 2018, pp. 953–956.

[59] J. Mo, L. Lamontagne, R. Khoury, Word embeddings and global preference for contextual suggestion, in: TREC, 2016.

[60] R. McCreadie, C. Buntain, I. Soboroff, TREC incident streams: Finding actionable information on social media, in: ISCRAM, 2019.

[61] A. Kumar, S. Johari, Push notification as a business enhancement technique for e-commerce, in: Pro-
ceedings of the Third International Conference on [48] J. Cassell, Embodied conversational interface Image Information Processing, ICIIP ’15, 2015, pp. agents, Commun. ACM 43 (2000) 70–78. 450–454. [49] Z. Wang, J. Lee, S. Marsella, Towards more com- [62] V. S. Moertini, C. D. Nugroho, E-commerce mobile prehensive listening behavior: Beyond the bobble marketing model resolving users acceptance crite- head, Intelligent Virtual Agents (2011) 216–227. ria, International Journal of Managing Information [50] J. E. Allen, C. I. Guinn, E. Horvtz, Mixed-initiative Technology 4 (2012) 23–40. interaction, IEEE Intelligent Systems and their Ap- [63] L. Tan, A. Roegiest, J. Lin, C. L. Clarke, An explo- ration of evaluation metrics for mobile push notifi- [74] R. S. Taylor, The process of asking questions, Amer- cations, in: Proceedings of the 39th International ican Documentation 13 (1962) 391–396. ACM SIGIR Conference on Research and Devel- [75] B. J. Jansen, A. Spink, T. Saracevic, Real life, real opment in Information Retrieval, SIGIR ’16, ACM, users, and real needs: a study and analysis of user New York, NY, USA, 2016, p. 741–744. queries on the web, Information Processing & Man- [64] A. Sahami Shirazi, N. Henze, T. Dingler, M. Pielot, agement 36 (2000) 207 – 227. D. Weber, A. Schmidt, Large-scale assessment of [76] R. Kelly, S. J. Payne, Collaborative web search in mobile notifications, in: Proceedings of the SIGCHI context: A study of tool use in everyday tasks, in: Conference on Human Factors in Computing Sys- Proceedings of the 17th ACM Conference on Com- tems, CHI ’14, ACM, New York, NY, USA, 2014, p. puter Supported Cooperative Work & Social Com- 3055–3064. puting, CSCW ’14, ACM, New York, NY, USA, 2014, [65] M. Pielot, K. Church, R. de Oliveira, An in-situ study p. 807–819. of mobile phone notifications, in: Proceedings [77] C. I. 
Seeking, Collaborative information seeking - of the 16th International Conference on Human- best practices, new domains and new thoughts | Computer Interaction with Mobile Devices & Ser- preben hansen | springer, 2015. vices, MobileHCI ’14, ACM, New York, NY, USA, [78] C. Shah, R. González-Ibáñez, Exploring information 2014, p. 233–242. seeking processes in collaborative search tasks, in: [66] A. Mehrotra, V. Pejovic, J. Vermeulen, R. Hendley, Proceedings of the 73rd ASIS&T Annual Meeting M. Musolesi, My phone and me: Understanding on Navigating Streams in an Information Ecosys- people’s receptivity to mobile notifications, in: Pro- tem - Volume 47, ASIS&T ’10, American Society for ceedings of the 2016 SIGCHI Conference on Human Information Science, USA, 2010. Factors in Computing Systems, CHI ’16, ACM, New [79] J. Foster, Understanding interaction in information York, NY, USA, 2016, p. 1021–1032. seeking and use as a discourse: a dialogic approach, [67] A. Mehrotra, R. Hendley, M. Musolesi, Prefminer: J. Documentation 65 (2009) 83–105. Mining user’s preferences for intelligent mobile no- tification management, in: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp ’16, ACM, New York, NY, USA, 2016, p. 1223–1234. [68] B. Tabuenca, M. Kalz, S. Ternier, M. Specht, Stop and think: Exploring mobile notifications to foster reflective practice on meta-learning, IEEE Transac- tions on Learning Technologies 8 (2015) 124–135. [69] F. Bentley, K. Tollmar, The Power of Mobile Notifica- tions to Increase Wellbeing Logging Behavior, CHI ’13, ACM, New York, NY, USA, 2013, p. 1095–1098. [70] S. M. Gray, Looking for information: A survey of research on information seeking, needs, and behav- ior, Journal of the Medical Library Association 91 (2003) 259–260. [71] A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, D. Jurafsky, P. Taylor, R. Martin, C. Van Ess- Dykema, M. 
Meteer, Dialogue act modeling for au- tomatic tagging and recognition of conversational speech, Computational Linguistics 26 (2000) 339– 374. [72] D. Case, Looking for information—a survey of re- search on information seeking, needs, and behavior, Inf. Res. 8 (2003). [73] S. Shiga, H. Joho, R. Blanco, J. R. Trippas, M. Sander- son, Modelling information needs in collaborative search conversations, in: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’17, ACM, New York, NY, USA, 2017, p. 715–724.