=Paper= {{Paper |id=Vol-2935/paper4 |storemode=property |title=Non-humorous Use of Laughter in Spoken Dialogue Systems |pdfUrl=https://ceur-ws.org/Vol-2935/paper4.pdf |volume=Vol-2935 |authors=Vladislav Maraev,Jean-Philippe Bernardy,Christine Howes }} ==Non-humorous Use of Laughter in Spoken Dialogue Systems== https://ceur-ws.org/Vol-2935/paper4.pdf
                          Non-humorous use of laughter in spoken dialogue systems

                     Vladislav Maraev¹*, Jean-Philippe Bernardy¹ and Christine Howes¹
           ¹ Centre for Linguistic Theory and Studies in Probability (CLASP), Department of Philosophy,
                             Linguistics and Theory of Science, University of Gothenburg
                          {vladislav.maraev, jean-philippe.bernardy, christine.howes}@gu.se



Abstract

In this paper we argue that laughter, an ambiguous yet ubiquitous signal in everyday interactions, can act as an important feature for task-oriented dialogue systems. We show which components of a dialogue system should be affected and modified, and more specifically how particular types of laughter can be accounted for in a dialogue manager as instances of short answers, feedback and the vocalisations accompanying them.

1 Introduction

Laughter is very frequent in everyday interactions: for instance, in the Switchboard Dialogue Act Corpus [Jurafsky et al., 1997] laughter occurs about once every 200 words. Laughter is an ambiguous social signal. In addition to communicating the joy and pleasure intuitively associated with humour, it can also communicate embarrassment, be used to smooth and soften everyday interactions, and bear pragmatic functions such as marking irony or the usage of a word in a specific sense [Poyatos, 1993; Mazzocconi, 2019; Ginzburg et al., 2020].

For a spoken dialogue system, laughter is an important signal to account for due to its contribution to the naturalness of automated dialogue. Laughter can be used in chit-chat dialogue due to its potential to build rapport and establish a para-social bond between the user and the artificial agent.

There have been attempts to produce laughs as a way to mimic human behaviour and align with it [Urbain et al., 2010; El Haddad et al., 2019], as well as laughing avatars mainly focussed on laughter as a reaction to jokes [Ochs and Pelachaud, 2013; Ding et al., 2014]. In this paper we take a rather different approach. We start from examples of the usage of laughter in real task-oriented dialogue and then propose ways in which these behaviours can be reproduced in a dialogue system and, more specifically, in its dialogue management component.

Example (1) below is an excerpt from a role-play dialogue collected by Howes et al. [2019] for their Directory Enquiries Corpus (DEC) [Bondarenko et al., 2020]. Dialogue participants played the roles of a caller and an operator, with callers asking for the phone numbers of certain named businesses. Half of the dialogues took place in a noisy environment, which induced many mishearings and laughs. This paper addresses the following research question: how can these laughs be accounted for in a dialogue system which implements a similar scenario?

 (1) DEC:22_KL_loc2
     56  Caller    er the next one is er tanfield chambers
     57  Operator  santias?
     58  Caller    tanfield like t- T A N
     59  Operator  sorry i don’t hear you again please?
     60  Caller    er T A N
     61  Operator  C?
     62  Caller    tanfield
     63  Operator  A
     64  Operator  N
     65  Caller    yeah
     66  Caller    and then field
     67  Operator  and then seal?
     68  Caller    chambers
     69  Operator  (laughter) sorry i hear you quite poorly
     70  Operator  let’s try again
     71  Operator  C?
     72  Caller    yeah sorry the traffic is crazy around here
     73  Operator  I know (laughter) don’t worry
     74  Operator  so C
     75  Operator  A
     76  Caller    er
     77  Caller    tanfield T like thomas

Let’s look at the first laughter (line 69). We can see that the operator’s question “and then seal?” (l.67) was not addressed and this piece of information was not grounded. “C?” (l.71) refers to the restart from the beginning (it was “Tanfield”, but she has heard “C”). The negative feedback provided by the operator (l.69) entails extra effort from the caller, who needs to restart her request from the beginning; this obligation is somewhat intrusive and may require extra smoothing [Mazzocconi, 2019; Raclaw and Ford, 2017]. For our purposes, we will treat this laughter as accompanying negative feedback.

    * Contact Author

    Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
For a dialogue system designer, this poses an empirical question, namely, would it be useful to soften negative feedback with laughter? This applies, for instance, to the feedback associated with a local failure (e.g. a speech recognition failure), such as “Sorry, I didn’t understand” or “Sorry I didn’t hear you”. It may also be useful where negative feedback is the result of an external query, for example, when something is not found in the database, and it can accompany a system request to start over, as in example (1).

The reaction to an apology can also be accompanied by laughter, as with the second laugh in (1) (l.73). We do not think that these days users often apologise to a dialogue system, as it is usually the dialogue system which is at fault, but this might be different for special cases of systems that aim at more naturalistic behaviour.

In this paper we consider laughter from the utilitarian perspective and attempt to determine which kinds of laughs can be relevant for dialogue systems. Next, we will look at laughter from the point of view of providing feedback, either positive or negative.

In Section 2 we start with a background on our approach to dialogue, dialogue management and laughter. Next, Section 3 presents a small typology of laughter types that we think should be accounted for in a task-oriented dialogue system. In Section 4 we describe our own dialogue management framework and in Section 5 we show a formal account of the aforementioned types of laughter. We conclude with a brief discussion of our findings and further laughter-related issues in Section 6.

2 Background

2.1 Dialogue

A key aspect of dialogue systems is the coherence of the system’s responses. In this respect, a key component of a dialogue system is the dialogue manager, which selects appropriate system actions depending on the current state and the external context.

Two families of approaches to dialogue management can be considered: hand-crafted dialogue strategies [Allen et al., 1995; Larsson, 2002; Jokinen, 2009] and statistical modelling of dialogue [Rieser and Lemon, 2011; Young et al., 2010; Williams et al., 2017]. Frameworks for hand-crafted strategies range from finite-state machines and form-filling to more complex dialogue planning and logical inference systems, such as Information State Update (ISU) [Larsson, 2002], which we employ here. Although there has been a lot of development in dialogue systems in recent years, only a few approaches reflect advancements in dialogue theory. Our aim is to closely integrate dialogue systems with work in theoretical semantics and pragmatics of dialogue. In this paper we do so by employing our own implementation of the KoS theoretical dialogue framework [Ginzburg, 2012], which we discussed in [Maraev et al., 2020]. In this work we extend our implementation with rudimentary support for grounding, thereby allowing the implementation to be further extended to support certain types of laughter.

In KoS (and many other dynamic approaches to meaning), language is treated as a game, containing players (interlocutors), goals and rules. KoS represents language interaction by a dynamically changing context. The meaning of an utterance is then how it changes the context. Compared to most approaches, which represent a single context for both dialogue participants, KoS keeps separate representations for each participant, using the Dialogue Game Board (DGB). Thus, the information states of the participants comprise a private part and the dialogue gameboard that represents information arising from publicised interactions. The DGB tracks, at least, shared assumptions/visual field, moves (= utterances, form and content), and questions under discussion.

In dialogue, especially in a dialogue with a machine which involves the uncertainty of automatic speech recognition (ASR) and natural language understanding (NLU) components, we cannot assume perfect communication. While communicating, especially over an unreliable communication channel, humans give each other evidence that their contributions are understood to a certain extent, sufficient for current purposes. Clark [1996] and Allwood [1995] distinguish four levels of action related to different degrees of grounding. Here we list them according to the action ladder [Clark, 1996], from the hearer’s perspective.

  1. Acceptance level determines whether the content of the utterance was accepted or rejected by the hearer.
  2. Understanding level specifies whether the utterance was understood by the hearer.
  3. Perception level determines whether the utterance was perceived by the hearer.
  4. Contact level determines whether interlocutors have established a channel of communication.

The action ladder assumes that if the level above is complete, then all levels below are complete. For instance, if Bob asks “Do you like Paris” and Mary replies “Yes”, then Bob’s utterance is accepted (and also understood, perceived, and their contact has been established). If she asks “Paris?” then it might signal that Bob’s utterance was perceived but not understood (and thus not accepted).

Larsson [2002] accounts for different levels of action within the IBiS2 dialogue management framework using a set of rules to update the common ground represented in the information state of the system. He uses “Interactive Communication Management” (ICM) moves [Allwood, 1995] as explicit signals concerned with communicating the updates to the common ground, and sequencing moves, e.g. restarting a dialogue.

2.2 Laughter

Our focus of attention towards laughter is motivated by its ubiquity in natural dialogue. In the British National Corpus, laughter is quite a frequent signal regardless of gender and age: the spoken dialogue part of the British National Corpus (UK English, unscripted interactions that were recorded by volunteers in various social settings, balanced for age, region and social class) contains approximately one occurrence of laughter every 14 utterances. In the Switchboard Dialogue Act corpus [Jurafsky et al., 1997] (US English, one-on-one
interactions over the phone in which participants who are not familiar with each other discuss a potentially controversial subject, such as gun control or the school system), non-verbally vocalised dialogue acts (whole utterances that are marked as non-verbal) constitute 1.7% of all dialogue acts and 65% of them contain laughter. Laughter tokens make up 0.5% of all the tokens that occur in the Switchboard Dialogue Act corpus.

Laughter production in conversation is not exclusively related to humour. But, perhaps unsurprisingly, the study of laughter has often been linked to the study of humour and the two terms are frequently used interchangeably. However, laughter does not occur only in response to humour or in order to frame it. Many studies, particularly in conversation analysis, have shown its crucial role in managing conversations at several levels: dynamics (turn-taking and topic-change), lexical (signalling problems of lexical retrieval or imprecision in the lexical choice), pragmatic (marking irony, disambiguating meaning, managing self-correction) and social (smoothing and softening difficult situations or showing (dis)affiliation) [Glenn, 2003; Jefferson, 1984; Mazzocconi, 2019; Petitjean and González-Martínez, 2015].

There have been several approaches to classifying types of laughter [e.g., Poyatos, 1993; Vettin and Todt, 2004; Mazzocconi, 2019]. Mazzocconi [2019] claims that the most problematic issue with existing taxonomies is that they mix types of laughter functions with types of laughter triggers, so she roots her proposal in the function of laughter and the propositional content of the laughable, that is, the argument the laughter predicates about: an event or state referred to by an utterance or exophorically [Glenn, 2003]. In this paper we look at laughter not exclusively from the perspective of a taxonomy that can be used as a theoretical framework but from the utilitarian perspective, looking at which kinds of laughs can be relevant for dialogue systems.

Laughter as a way for an embodied conversational agent (ECA) to provide emotional response has gained some attention from the Affective Computing and other research communities. Becker-Asano and Ishiguro [2009] evaluated the role of laughter in the perception of social robots and indicated that the situational context, determined by linguistic and non-verbal cues (such as gaze), played an important role. Nijholt [2002] discusses the challenges of integrating humour into ECAs, and the existing integration of smiling and laughter in ECAs is typically triggered by a joke told by a user or an agent [Ding et al., 2014; Ochs and Pelachaud, 2013]. El Haddad et al. [2019] looked at the mimicry of smiles and laughs between the interlocutors, which might also be used as the basis for an ECA’s behaviour. Urbain et al. [2010] take a similar perspective, equipping ECAs with the capability to join their conversational partner’s laugh. In this work we take a contrasting approach, looking at pragmatic functions of some types of laughter, namely providing feedback and answering questions, and provide a formal account of such behaviour within a dialogue management framework.

3 Types of laughter

In this section we outline some types of laughter that can be of special interest to task-oriented dialogue systems and can be accounted for within our proposed framework.

3.1 Laughter as a component of grounding

As we have mentioned in Section 2, and in accord with Allwood [1995], Clark [1996] and Larsson [2002], we consider four action levels that are involved in a dialogue. Here we discuss what can happen at each level of action (contact, perception, understanding and reaction) with respect to laughter.

Contact and perception levels

Troubles related to establishing and maintaining a stable communication channel can lead to laughter. One such example would be delays in communication, for instance over an unreliable network, which might lead to a person already speaking at the moment when the communication is only supposed to be established. Obvious examples of such cases are caused by signal jitter over video conference platforms like Zoom.

The lack of perception indicates things that haven’t been heard correctly (cases similar to (1)). Also, it seems that interruptions or events related to that can be quite surprising and laughter can be a natural reaction to a surprise (see Section 6).

Understanding level

The lack of pragmatic understanding relates to the kinds of incongruities that are caused by the violation of the principle of conversational relevance. This is highly relevant for dialogue systems because they are prone to errors in this realm. It is often the case that incorrect NLU or ASR can lead to prioritising irrelevant results (for example, in cases of out-of-scope user queries), which can confuse the user and, therefore, cause laughter. This type of laughter can be treated as negative feedback.

This accounts for examples (2) and (3) below. Larsson [2002] subdivides this level into three categories for the negative feedback (context-dependent, context-independent and pragmatic). Examples (2) and (3) relate to the pragmatic level of misunderstanding.

 (2) from the dialogue between a virtual assistant (Diana) and a person with ASD (Mark):
     Mark      Diana, what is money?
     Diana     I am Diana, a virtual interlocutor.
     Audience  (laugh)
 (3) constructed example
     Brian  Would you like tea or coffee?
     Katie  yes
     Brian  (laughs)

A dialogue system can also be unsure about what has been understood. In such cases, the system should demonstrate a lower degree of commitment to what has been said as a part of a display of understanding. For example, in the case of feedback regarding the user input, when the system repeats the input after the user, it can be useful to include laughter in verbatim repeats, which would mean: yes, I heard (understood) this, but I might be wrong. This can also be useful for a system’s actions taken based on low confidence results.
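The hedging strategy described for the understanding level (repeat the input, adding laughter when the system is unsure) can be sketched as a small decision function. This is only an illustration under stated assumptions: the two thresholds, the `Feedback` record and the `feedback_for` function are hypothetical names that do not come from the paper or from any existing dialogue manager. Whether negative feedback should be softened with laughter is kept as an explicit switch, since the text treats that as an open empirical question.

```python
from dataclasses import dataclass

# Hypothetical thresholds on recognition confidence in [0, 1]; not from the paper.
REJECT_BELOW = 0.3
HEDGE_BELOW = 0.7


@dataclass(frozen=True)
class Feedback:
    polarity: str   # "positive" or "negative"
    text: str       # what the system says
    laughter: bool  # whether the move is accompanied by laughter


def feedback_for(heard: str, confidence: float,
                 soften_negative: bool = False) -> Feedback:
    """Pick a feedback move for a recognised user input."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence < REJECT_BELOW:
        # Too unreliable to repeat: negative feedback, optionally softened.
        return Feedback("negative", "Sorry, I didn't hear you", soften_negative)
    if confidence < HEDGE_BELOW:
        # Verbatim repeat with laughter: "I heard this, but I might be wrong".
        return Feedback("positive", heard, True)
    # Confident understanding: plain verbatim repeat.
    return Feedback("positive", heard, False)


if __name__ == "__main__":
    for score in (0.9, 0.5, 0.1):
        print(score, feedback_for("tanfield", score))
```

A real system would probably derive the thresholds from data and distinguish ASR confidence from NLU confidence; the sketch collapses both into a single score to keep the decision visible.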
Reaction (consider for acceptance) level

On this level what has been understood can be either accepted or rejected for the current purpose. Acceptance laughter can typically be related to a reaction to humour, which is out of the scope of the current paper, or to an apology (see next section).

Ginzburg et al. [2020] consider some uses of standalone laughter as cases of a negative response to a polar question (4) or a signal of disbelief in a previously uttered assertion (5).

 (4) From Ginzburg et al. [2020], context: Bayern München goalkeeper Manuel Neuer faces the press after his team’s (Dreierkette, three-in-the-back) defence has proved highly problematic in the game just played (which they won 3-2 against Paderborn).
     Journalist:    (smile) Dreierkette auch ‘ne Option?
                    (Is the three-at-the-back also an option?)
     Manuel Neuer:  fuh fuh fuh
                    (brief laugh)
 (5) From Ginzburg et al. [2020] (biblical example rephrased as a dialogue)
     God:      You will at age 99 with your aged wife Sarah have a son.
     Abraham:  (laughs)
     → I don’t think I will at age 99 have a son

In Section 5 we show how this kind of laughter as a negative response, like (4), can be handled by the dialogue manager.

3.2 Laughter and intrusion

In natural dialogue, an intrusion is frequently associated with laughter. In the Switchboard Dialogue Act corpus (SWDA) [Jurafsky et al., 1997] an Apology dialogue act is more related to laughter, as compared to other dialogue acts. In Figure 1 we show how many dialogue acts are associated with utterances¹ containing laughter, for the current dialogue act and for preceding and following utterances, depending on the speaker. In addition to an apology, we show its adjacency counterpart (the second element of the utterance pair, produced by the other speaker [Schegloff and Sacks, 1973]), the Downplayer, realised, for instance, by utterances like “Don’t worry” or “It’s alright”.

[Figure 1 plot not reproduced. Series: Statement-non-opinion, Apology, Downplayer. Axes: within the given DA; in previous utterance by self; in next utterance by self; in previous utterance by other; in next utterance by other. Scale ticks: 0.04, 0.08, 0.12, 0.16.]

Figure 1: Comparison of the most common dialogue act in SWDA, “Statement-Non-Opinion” (33.27% of all utterances), with the dialogue acts “Apology” (0.04%) and “Downplayer” (0.05%). The proportion of utterances that contain laughter is shown in association with each dialogue act.

In (6), the caller reacts with compassionate laughter to the apology given by the operator. This is a similar instance of laughter to the one seen in (1): the second laugh there shows that the same reaction as in (6) can be expected from the operator.

 (6) DEC:16_HG_loc2
     162  Operator  still not finding it
     163  Operator  having problems with this one
     164  Caller    okay
     165  Caller    er maybe i can find
     166  Caller    er the place myself but thank you very much for the information
     167  Operator  no problem sorry for not finding the the last one
     168  Caller    (laughter)
     169  Caller    no worries
     170  Caller    thank you

We also observe that laughter can clearly accompany the asking for a favour by the same speaker. In example (7) the operator asks the caller if they can start from the beginning, which can be treated as an intrusion of some sort; therefore the request for a favour and the apology are accompanied by laughter.

 (7) DEC:24_LK_loc2
     59  Caller    B as in bicycle
     60  Operator  yeah
     61  Caller    then you have R
     62  Caller    I
     63  Operator  R
     64  Caller    G
     65  Operator  I
     66  Operator  okay sorry no- now i lost the track okay can we it start from the beginning (laughter) sorry
     67  Caller    okay
     68  Caller    yes we can
     69  Operator  maybe you can just say the uh say words
     70  Caller    yeah no no problem

   ¹ In SWDA each utterance is typically mapped to a single dialogue act.
4     Dialogue manager architecture                                        they are linear, these hypotheses can also be removed from
We believe that it is crucial to use formal tools which are most           the state. In particular, we have a fixed set of rules (they re-
appropriate for the task: one should be able to express the                main available even after being used). Each such rule ma-
rules of various genres of dialogue in a concise way, free,                nipulates a part of the information state (captured by its pre-
to any possible extent, of irrelevant technical details. In the            misses) and leaves everything else in the state alone.
view of Dixon et al. [2009] this is best done by represent-                   Our dialogue manager (DM) models the information-state
ing the information-state of the agents as updatable sets of               of only one participant. Regardless, this participant can
propositions. Very often, dialogue-management rules update                 record its own beliefs about the state of other participants.
subsets (propositions) of the information state independently              In general, the core of the DM is comprised of a set of linear-
from the rest. A suitable and flexible way to represent such               logic rules which depend on the domain of application. How-
updates is as function types in linear logic. The domain of                ever, many rules will be domain-independent (such as generic
the function is the subset of propositions to update, and the              processing of answers). We show examples of such rules in
co-domain is the (new) set of propositions which it replaces.              Section 4.4.
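The reading of a rule as a linear function type, whose domain is the subset of propositions to be updated and whose co-domain is the set that replaces it, can be illustrated with a multiset rewrite. The sketch below is a propositional simplification under stated assumptions, not the authors' proof-search engine: it has no variables or unification, and `LinearRule`, `can_fire`, `fire` and the proposition strings are all hypothetical names introduced for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class LinearRule:
    """A rule of the shape premisses -o conclusions over ground propositions."""
    name: str
    premisses: Counter    # propositions consumed when the rule fires
    conclusions: Counter  # propositions produced in their place


def can_fire(rule: LinearRule, state: Counter) -> bool:
    # Every premiss must be present in the state with sufficient multiplicity.
    return all(state[p] >= n for p, n in rule.premisses.items())


def fire(rule: LinearRule, state: Counter) -> Counter:
    """Return a new state; the rest of the state is left untouched."""
    if not can_fire(rule, state):
        raise ValueError(f"rule {rule.name!r} is not applicable")
    new_state = Counter(state)
    new_state.subtract(rule.premisses)   # linear hypotheses are used up
    new_state.update(rule.conclusions)   # and replaced by the conclusions
    return +new_state                    # unary plus drops zero counts


# Hypothetical propositions for a directory-enquiries turn.
ground_name = LinearRule(
    name="ground_name",
    premisses=Counter({"heard(tanfield)": 1, "pending(name)": 1}),
    conclusions=Counter({"grounded(name, tanfield)": 1}),
)

state = Counter({"heard(tanfield)": 1, "pending(name)": 1, "contact": 1})
after = fire(ground_name, state)
```

The rule object itself stays available after firing, but it cannot fire again on `after` because its premisses have been consumed, which is the "used only once" property of linear hypotheses in miniature.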
   By using well-known techniques which correspond well
with the intuition of information-state based dialogue man-
                                                                           4.2   Questions and answers
agement, we are able to provide a fully working prototype of               In this paper, the essential components of the representation
the components of our framework:                                           of a question are a type A, and a predicate P over A. Using a
                                                                           typed intuitionistic logic, we write:
    1. a proof-search engine based on linear logic, modified                       A : Type                  P : A → Prop
       to support inputs from external systems (representing
                                                                              The intent of the question is to find out about a value x
       inputs and outputs of the agent)
                                                                           of type A which makes P x true, or at least entertained by
    2. a set of rules which function as a core framework for               the other participant. We provide several examples in Table
       dialogue management (in the style of KoS [Ginzburg,                 1. It is worth stressing that the type A can be large (for ex-
       2012])                                                              ample asking for any location) or as small as a boolean (if
    3. several examples which use the above to construct po-               one requires a simple yes/no answer). We note in passing
       tential applications of the system.                                 that, typically, polar questions can be answered not just by a
                                                                           boolean but by qualifying the predicate in question, for exam-
4.1    Linear rules and proof search                                       ple, “maybe”, “on Tuesdays”, etc. (Table 1, last two rows).
                                                                           This is formalised by letting A = Prop → Prop.
Typically, and in particular in the archetypal logic program-
ming language Prolog [Bratko, 2001], axioms and rules are                  4.3   Representation of questions with
expressed within the general framework of first-order logic.                     metavariables
However, several authors [Dixon et al., 2009; Martens, 2015]
have proposed using linear logic [Girard, 1995] instead. For               In this subsection we show how a metavariable can represent
our purpose, the crucial feature of linear logic is that hypothe-          what is being asked, as the unknown in a proposition. A first
ses may be used only once.                                                 use for metavariables is to represent the requested answer to
   In general, the linear arrow corresponds to destructive state           a question.
updates. Thus, the hypotheses available for proof search cor-                 Within the state of the agent, if the value of the requested
respond to the state of the system. In our application, they               answer is represented as a metavariable x , then the question
will correspond to the information state of the dialogue par-              can be represented as: Q A x (P x ). That is, the pending
ticipant.                                                                  question (Q denotes a question constructor) is a triple of a
   In linear logic, normally firing a linear rule corresponds to           type, a metavariable x , and a proposition where x occurs. We
triggering an action of an agent, and a complete proof cor-                stress that P x is not part of the information state of the agent
responds to a scenario, i.e. a sequence of actions, possibly               yet, rather the fact that the above question is under discussion
involving action from several agents. However, the informa-                is a fact. For example, after asking “Where does John live?”,
tion state (typically in the literature and in this paper as well),        we have:
corresponds to the state of a single agent. Thus, a scenario
is conceived as a sequence of actions and updates of the in-                     haveQud : QUD (Q Location x (Live John x ))
formation state of a single agent a, even though such actions                 Resolving a question can be done by communicating an
can be attributed to any other dialogue participant b. (That is,           answer. An answer to a question (A : Type; P : A → Prop)
they are a’s representation of actions of b.) Scenarios can be             can be of either of the two following forms: i) A ShortAn-
realised as a sequence of actual actions and updates. That is,             swer, which is a pair of an element X : A and its type A, rep-
an action can result in sending a message to the outside world             resented as ShortAnswer A X or ii) An Assertion which is
(in the form of speech, movement, etc.). Conversely, events                a proposition R : Prop, represented as Assert R. Therefore,
happening in the outside world can result in extra-logical up-             one way to process a short answer is by the processShort
dates of the information state (through a model of the percep-             rule:
tory subsystem).
   In our implementation, we treat the information state as a                    processShort : (a : Type) → (x : a) → (p : Prop) →
multiset of linear hypotheses that can be queried. Because                         ShortAnswer a x ⊸ QUD (Q a x p) ⊸ p
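
As a minimal sketch of this idea (an illustration only, not the authors' implementation), the information state can be modelled in Python as a multiset of hypotheses, and processShort as a function that consumes its two premises and adds the resolved proposition. The tuple encoding of propositions, the `?x` marker for the metavariable and all helper names are assumptions made for this example.

```python
from collections import Counter

META = "?x"  # hypothetical marker standing for the metavariable x


def substitute(term, value):
    """Replace every occurrence of the metavariable in a nested-tuple term."""
    if term == META:
        return value
    if isinstance(term, tuple):
        return tuple(substitute(part, value) for part in term)
    return term


def consume(state, hyp):
    """Use up one copy of a linear hypothesis."""
    state[hyp] -= 1
    if state[hyp] == 0:
        del state[hyp]


def process_short(state, a, x):
    """Fire the rule once: consume ShortAnswer a x and a QUD of type a, add p[x]."""
    answer = ("ShortAnswer", a, x)
    if state[answer] == 0:
        return False
    for hyp in list(state):
        if hyp[0] == "QUD" and hyp[1] == a:
            consume(state, answer)
            consume(state, hyp)
            state[substitute(hyp[2], x)] += 1
            return True
    return False


state = Counter([
    ("QUD", "Location", ("Live", "John", META)),
    ("ShortAnswer", "Location", "Paris"),
])
process_short(state, "Location", "Paris")
```

Because both premises are removed when the rule fires, a second call with the same arguments finds nothing to consume, which is the "used only once" behaviour of linear hypotheses.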




 question             A                   P                                                reply      x

  Where does
                      Location            λx .Live John x                            in London        ShortAnswer Location London
  John live?
  Does John                               λx .if x then (Live John Paris)
                      Bool                                                           yes              ShortAnswer Bool True
  live in Paris?                          else Not (Live John Paris)

  What time is it?    Time                λx .IsTime x                               It is 5am.       Assert (IsTime 5.00)

  Does John                                                                                           ShortAnswer (Prop → Prop)
                      Prop → Prop         λm.m (Live John Paris)                     yes
  live in Paris?                                                                                      (λx .x )
  Does John                                                                                           ShortAnswer (Prop → Prop)
                      Prop → Prop         λm.m (Live John Paris)                     from January
  live in Paris?                                                                                      (λx .FromJanuary (x ))

Table 1: Examples of questions and the possible corresponding answers. The type A is the type of possible short answers. The proposition
P x is the interpretation of a short answer x . The x column shows the formal representation of a possible answer, either in short form or
assertion form.
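
The rows of Table 1 can be mimicked with a small Python sketch, given here only as an illustration under stated assumptions: types are plain string labels, propositions are nested tuples, and the names `Question` and `resolve` are hypothetical. A question pairs a type A with a predicate P, and a short answer is interpreted by applying P to the supplied value.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Question:
    answer_type: str                 # the type A, kept as a plain label here
    predicate: Callable[[Any], Any]  # the predicate P : A -> Prop


def resolve(question, answer_type, value):
    """Interpret a short answer: check that the types match, then return P x."""
    if answer_type != question.answer_type:
        raise TypeError("short answer does not match the question type")
    return question.predicate(value)


LIVE_PARIS = ("Live", "John", "Paris")

# "Where does John live?"
where = Question("Location", lambda x: ("Live", "John", x))
# "Does John live in Paris?" answered by a boolean
polar_bool = Question("Bool", lambda x: LIVE_PARIS if x else ("Not", LIVE_PARIS))
# "Does John live in Paris?" answered by a modifier of type Prop -> Prop
polar_mod = Question("Prop -> Prop", lambda m: m(LIVE_PARIS))

in_london = resolve(where, "Location", "London")
plain_yes = resolve(polar_mod, "Prop -> Prop", lambda p: p)
from_january = resolve(polar_mod, "Prop -> Prop", lambda p: ("FromJanuary", p))
```

Treating the polar question as expecting a modifier lets "yes" (the identity) and "from January" share one representation, as in the last two rows of the table.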


Above we use Π type binders to declare (meta)variables                    participant. Regardless, this participant can record its own be-
(written here (a : Type) →, (x : a) →, etc.). This termi-                 liefs about the state of other participants. In general, the core
nology will make sense to readers familiar with dependent                 of the DM comprises a set of linear-logic rules which
types. For others, such binders can be thought of as universal            depend on the domain of application. However, many rules
quantification (∀a, ∀x, etc.); the difference is that the type of         will be domain-independent (such as the generic processing
the bound variable is specified.2                                         of answers).
   We demand in particular that types in the answer and in                   To be useful, a DM must interact with the outside world,
the question match (a occurs in both places). Additionally,               and this interaction cannot be represented using logical rules,
because x occurs in p, the information state will mention the             which can only manipulate data which is already integrated in
concrete x which was provided in the answer. For example,                 the information state. Here, we assume that the information
if the QUD was (Q Location x (Live John x )) and the                      that comes from sources which are external to the dialogue
system processes the answer ShortAnswer Location Paris,                   manager is expressed in terms of semantic interpretations of
then x unifies with Paris, and the new state will include                 moves, and contains information about the speaker and the
Live John Paris.                                                          addressee in a structured way. We provide 5 basic types of
   To process assertions, we can use the following rule:                  moves, specified with a speaker and an addressee, as an illus-
                                                                          tration:
       processAssert : (a : Type) → (x : a) → (p : Prop) →
         Assert p ⊸ QUD (Q a x p) ⊸ p                                          Greet        spkr addr
                                                                               CounterGreet spkr addr
That is, if (1) p was asserted, and (2) the proposition q is                   Ask          question spkr addr
part of a question under discussion, and (3) p can be unified                  ShortAnswer vtype v spkr addr
with q (we ensure this unification by simply using the same                    Assert       p spkr addr
metavariable p in both roles in the above rule), then the asser-
tion resolves the question. Additionally, the metavariable x is               These moves can either be received as input or produced as
made ground to a value provided by p, by virtue of unification            outputs. If they are inputs, they come from the NLU compo-
of p and q. For example, “John lives in Paris” answers both of            nent, and they enter the context with Heard : Move → Prop
the questions “Where does John live?” and “Does John live                 predicate. For example, if one hears a greeting, the propo-
in Paris?” (there is unification), but, not, for example, “What           sition Heard (Greet S A) is added to the information
time is it?” (there is no unification). Note that, in both cases          state/context, without any rule being fired—this is what we
(processAssert and processShort), the information state is                mean by an external source.
updated with the proposition posed in the question.                           If they are outputs, to be further used by the NLG com-
                                                                          ponent, some rule will place them in Agenda. For example,
4.4    Dialogue management                                                to issue a counter greeting, a rule will place the proposition
In this section we integrate our question/answering frame-                (CounterGreet A S ) in the Cons-list Agenda part of the
work within a more complete dialogue manager (DM). We                     information state.
stress that this DM models the information-state of only one                  Thereby each move is accompanied by the information
                                                                          about who uttered it and to whom it was addressed.
   2
    The reader worried about any theoretical difficulty regarding         All the moves are recorded in the Moves part of the partici-
mixing linear and dependent types is directed to Atkey [2018] and         pant’s dialogue gameboard, as a Cons-list (stack).
Abel and Bernardy [2020].                                                     Additionally, we record any move m which one has yet to




actively react to, in a hypothesis of the form Pending m. We                on whether the fact is unique and concrete or not (defined by
cannot use the Moves part of the state for this purpose, be-                operators →! and →? respectively, see Maraev et al., 2020
cause it is meant to be static (not to be consumed). Pending                for further details).
thus allows one to make the difference between a move which
is fully processed and a pending one.                                             produceAnswer :
    Here we will provide a few examples of the rules which                          (a : Type) → (x : a) →! (p : Prop) →
are implemented in our system, and we refer our reader to                           (qs : List Question) →
[Maraev et al., 2020] for a more detailed description.                              QUD (Cons (Q USER a x p) qs) ⊸ p _
                                                                                    [_ :: Agenda (ShortAnswer a x SYSTEM USER);
Examples                                                                             _ :: QUD qs;
We can show how basic move-adjacency can be defined in the                           _ :: Answered (Q USER a x p)]
example of a counter greeting preconditioned by a greeting
from the other party:3
                                                                            4.5   Extending the dialogue manager with
       counterGreeting : (x y : DP ) → HasTurn x _                                grounding strategies
         Agenda as ⊸ Pending (Greet y x ) ⊸                                 In this subsection we provide a sketch of basic grounding
         Agenda (Cons (CounterGreet x y) as)                                strategies and moves related to them, which will be further
   Another important rule accounts for pushing the content of               used to model laughter.
any received Ask move on top of the stack of questions under                   Dialogue systems deal with confidence scores from ASR
discussion (QUD).                                                           and NLU components, which reflect the uncertainty in user
                                                                            queries. For simplicity we will represent the confidence
       pushQUD : (q : Question) → (qs : List Question) →                    score t using two confidence thresholds
                 (x y : DP ) → Pending (Ask q x y) ⊸                        (T1 < T2), where RED would correspond to t < T1,
                 QUD qs ⊸ QUD (Cons q qs)                                   YELLOW to T1 < t < T2, and GREEN to T2 < t. Colour-
   If the user asserts something that relates to the top QUD,               coded confidence scores would accompany user moves, e.g.
then the QUD can be resolved and therefore removed from                     the Ask move such as “What time is it?” can be represented
the stack. The corresponding proposition p is saved as a                    as follows:
PendingUserFact.4 The following rule5 is an extended di-                          Ask (Q U Time t0 (IsTime t0 )) U S YELLOW
alogue management version of the rule previously introduced
in Section 4.3.                                                                Here we illustrate the possibility of extending the system
                                                                            with Interactive Communication Management (ICM) moves
       processAssert : (a : Type) → (x : a) → (p : Prop) →
                                                                            and grounding strategies, replicating Larsson’s [2002] ac-
         (qs : List Question) →
                                                                            count for grounding and feedback. ICM moves are used for
         (dp dp1 : DP ) → Pending (Assert p dp1 dp) ⊸
                                                                            coordination of the common ground in dialogue, which ex-
         QUD (Cons (Q dp a x p) qs) ⊸
                                                                            presses, for instance, explicit signals for integrating the in-
         [_ :: PendingUserFact p; _ :: QUD qs ]
                                                                            coming information and updating the common ground (dia-
   Then, other rules will take into account the                             logue gameboard in our implementation). The basic type for
PendingUserFact p in a system-specific way. In the                          the ICM move is the following:
simplest case, the system may treat p as a true proposition.
(In this paper we will consider meta-level pending user facts                     ICM level polarity content
instead.)                                                                   where level corresponds to the level of grounding (contact,
   Short answers are processed in a very similar way to asser-              perception, understanding, acceptance), polarity is either
tions:                                                                      positive or negative, and the optional value content corre-
       processShort : (a : Type) → (x : a) → (p : Prop) →                   sponds to a component of the common ground in question.
         (qs : List Question) → (dp dp1 : DP ) →                            For instance, the move (ICM Per Neg None) would corre-
         Pending (ShortAnswer a x dp1 dp) ⊸                                 spond to the utterance “I didn’t understand what you said” or
         QUD (Cons (Q dp a x p) qs) ⊸                                       “Pardon”, and the move (ICM Und Pos q) can be realised
         [_ :: PendingUserFact p; _ :: QUD qs ]                             as the utterance “You are asking me what time it is” if the
                                                                            QUD q corresponds to the question from Ask move exempli-
  If the system has a fact p in its database it can produce an              fied above.
answer or a domain-specific clarification request depending                    Next, we modify our basic pushQUD rule defined in Sec-
   3                                                                        tion 4.4 to support different system behaviours depending on
      Taking a linear argument and producing it again is a common
pattern, which can be spelled out A ⊸ (A ⊗ P ). From here on we             the confidence score. In the GREEN case, the question from
use the syntactic sugar A _ P for it.                                       the user's Ask move is integrated into QUD, and an ICM
    4
      For the current purposes we only remove the top QUD, but in a         move displaying positive acceptance feedback, i.e. “okay”,
more general case we can implement the policy that can potentially          (ICM Acc Pos None), is put on the Agenda. In
resolve any QUD from the stack.                                             the YELLOW case, the system should additionally report
    5                                                                       positive understanding, e.g. “You want to know about time”,
      Note the use of the single colon (:) for metavariables and the
double colon for information-state hypotheses (::).                         so it also adds an (ICM Und Pos q) move to the Agenda.
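
The colour-coded thresholds and the three system behaviours of this subsection can be sketched in Python as follows. This is a simplified illustration, not the paper's rule engine: the threshold values, the handling of scores exactly equal to a threshold, the tuple encoding of moves and the `Understood?` label for the confirmation question are all assumptions.

```python
T1, T2 = 0.4, 0.8  # hypothetical values; the text only requires T1 < T2


def colour(t):
    """Map a confidence score to RED, YELLOW or GREEN (boundary handling assumed)."""
    if t < T1:
        return "RED"
    if t < T2:
        return "YELLOW"
    return "GREEN"


def push_qud(q, t, qud, agenda):
    """Return the new (QUD, Agenda) stacks after hearing Ask q with confidence t."""
    level = colour(t)
    accept = ("ICM", "Acc", "Pos", None)
    if level == "GREEN":
        # integrate the question and acknowledge it ("okay")
        return [q] + qud, [accept] + agenda
    if level == "YELLOW":
        # additionally report what was understood
        return [q] + qud, [accept, ("ICM", "Und", "Pos", q)] + agenda
    # RED: raise a yes/no question about whether q was understood correctly
    check = ("Q", "Bool", ("Understood?", q))
    return [check] + qud, [("ICM", "Und", "Int", q)] + agenda


ask_time = ("Q", "Time", ("IsTime", "?t"))
qud, agenda = push_qud(ask_time, 0.55, [], [])
```

In the RED branch only the confirmation question is pushed, following the icmINTConfirm rule as written.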




     pushQUDGreen : (q : Question) →                                   queries with more arguments can be resolved in shorter ut-
       (qs : List Question) → (x y : DP ) →                            terance depending on the arguments that are made ground.
       Pending (Ask q x y GREEN ) ⊸ Agenda as ⊸                        For instance, in the context of an interaction at a food kiosk:
       QUD qs ⊸
         [_ :: QUD (Cons q qs);                                               ICM Und Pos
          _ :: Agenda (Cons (ICM Acc Pos None) as)]                             (QuestionIsNot
                                                                                  (Q U (Prop → Prop) m0 (m0 WantOlives)))
     pushQUDYellow : (q : Question) →                                  could become a simple “Sorry, let’s forget olives.”.
       (qs : List Question) → (x y : DP ) →
       Pending (Ask q x y YELLOW ) ⊸ Agenda as ⊸                       5      Formal treatment of certain types of laughter
       QUD qs ⊸
         [_ :: QUD (Cons q qs);
          _ :: Agenda (Cons (ICM Acc Pos None)                         5.1     Laughter as a rejection signal
          (Cons (ICM Und Pos q) as))]                                  Laughter as a reaction to interrogative feedback in the case
    For RED confidence score, the system issues an interroga-          of low confidence ASR/NLU result can be illustrated by the
tive ICM query, such as “I understood you’re asking me about           following dialogue.
the time, is that correct?”. In this case a special type of QUD                U:   I would like to        Ask q
is introduced, namely a question about whether question q is                        order a vegan bean
correctly understood.                                                               burger.
     icmINTConfirm : (q : Question) → (x y : DP ) →                            S:   I understood you’d     ICM Und Int q
                                                                        (8)
       Pending (Ask q x y RED) ⊸ Agenda as ⊸                                        like to order a beef
       QUD qs ⊸                                                                     burger. Is that
       [_ :: QUD (Cons (Q Bool x                                                    correct?
           (if x then UND q                                                    U:   HAHAHA                 ShortAnswer Bool False
             else UNDN q)) qs);                                           Here we can treat laughter as a short negative answer, sim-
        _ :: Agenda (Cons (ICM Und Int q) as)]                         ilar to “No”. In the case of an interrogative ICM move, such an
   Processing answers related to such a type of QUD will be            answer can be processed using the icmINTneg rule defined
done as usual. For instance, a short “yes” or “no” will be             above.
treated here as a boolean, and depending on the answer the                This can be treated as a recovery strategy for different sys-
context will contain either PendingUserFact (UND q) or                 tem outputs not desired by dialogue system designers. This
PendingUserFact (UNDN q).                                              approach can be extended to other cases of user feedback,
   In this sketch implementation, we do not care about confi-          for instance, to cover the cases with higher confidence score
dence scores for these answers, leaving it underspecified, but         where the system produces ICM Und Pos q move, but this
further, more specific dialogue rules are possible.                    is out of the scope of the current paper.
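
The treatment of laughter as a negative short answer to an interrogative ICM move can be illustrated with the following Python sketch. It is a simplification under stated assumptions rather than the rules icmINTConfirm and icmINTneg themselves: the NLU stub `interpret_user_turn`, its word lists, the tuple encoding and the `Understood?` label are hypothetical.

```python
def interpret_user_turn(text):
    """Hypothetical NLU stub: map a reply to a Bool short answer, or None."""
    token = text.strip().lower()
    if token in {"yes", "yeah", "correct"}:
        return ("ShortAnswer", "Bool", True)
    # assumption: laughter after a confirmation request is read like "no"
    if token in {"no", "nope"} or token.startswith("haha"):
        return ("ShortAnswer", "Bool", False)
    return None


def answer_confirmation(q, qud, agenda, move):
    """Resolve the confirmation question on top of QUD; react to a negative answer."""
    if move is None or not qud or qud[0] != ("Q", "Bool", ("Understood?", q)):
        return qud, agenda, None
    fact = ("UND", q) if move[2] else ("UNDN", q)
    rest = qud[1:]  # the confirmation question is resolved and removed
    if fact[0] == "UNDN":
        # negative case: announce that the question was not q
        agenda = [("ICM", "Und", "Pos", ("QuestionIsNot", q))] + agenda
    return rest, agenda, fact


order = ("Q", "Order", "BeefBurger")
pending = [("Q", "Bool", ("Understood?", order))]
laugh = interpret_user_turn("HAHAHA")
rest, agenda, fact = answer_confirmation(order, pending, [], laugh)
```

Here the laugh from dialogue (8) ends up as the fact UNDN q, and the system schedules an ICM move stating that the question was not q.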
   Regardless of the particular answer, once the ICM question             Returning to the more sophisticated (4), it can be handled
is answered, it is removed from the QUD stack, so that the top of      by our generic rules for integrating QUDs (pushQUD). For
the QUD stack is restored to the originally asked question.            that we need to consider polar questions as expecting an an-
In our system, this is taken care of by the generic handling of        swer of Prop → Prop type (see Table 1). Recalling the ex-
ShortAnswer s. Thus, in the case of a positive answer to such          ample:
a query, there is nothing particular to do.                                    Journalist: (smile) Dreierkette auch ‘ne Option?
   In the negative case, the ICM move about the understand-                                          (Is the three-in-the-back also
ing that the question was not q is issued.                               (4)                         an option?)
     icmINTneg : (q : Question) → (x y : DP ) →                                Manuel Neuer:         fuh fuh fuh
       (c : Confidence) →                                                                            (brief laugh)
       PendingUserFact (UNDN q) ⊸                                      and a type for question:
       Agenda as ⊸                                                              A : Type                  P : A → Prop
       Agenda (Cons                                                       In this case,
          (ICM Und Pos (QuestionIsNot q)) as)
                                                                              A = Prop → Prop
   How ICM moves are converted to natural language ut-                        P = λm.m IsOptionDreierkette
terances, depending on q, is a natural language generation
(NLG) issue. For instance,                                                 The brief laughter by Manuel Neuer can be represented as:
     ICM Und Pos                                                               ⟦fuhfuhfuh⟧ = ShortAnswer
       (QuestionIsNot                                                            (Prop → Prop) (λx .Laughable x )
         (Q U Time t0 (IsTime t0 )))
                                                                       where the modification of the proposition, resulting in
can become the (rather tedious) utterance “So, you are not             (Laughable IsOptionDreierkette) has a very basic mean-
asking me what time it is”, whereas more sophisticated                 ing: this proposition is the laughable, without being more




specific about the laughter function. One can also consider           In (9) the caller experiences issues with coming up with pho-
being more specific, simply treating laughter as a negation           netic spellings for certain words. The first laugh (line 27)
(ShortAnswer (Prop → Prop) (λx .Not x )), but in general              deserves attention, as it seems that it reflects on both pleas-
laughter has a more nuanced meaning.                                  ant incongruity and social one (smoothing), according to the
                                                                      taxonomy of [Mazzocconi, 2019]. The pleasant incongruity
5.2    Laughter which accompanies feedback                            is due to the fact that the phonetic spelling of “U” as in “un-
Laughter can act as a part of ICM moves’ realisation per-             der” is incongruous with the preceding ones: a preposition
formed by the natural language generation (NLG) component. It         vs. proper nouns. The way to spell things phonetically is
seems to us that in ICM moves, in particular, the use of laugh-       typically culturally specific, with the most typical cases of
ter can be considered “safe”. For instance, an ICM move of the        cities or countries. Stereotypes and conversational conven-
form (ICM Und Pos (QuestionIsNot (Q U (Prop →                         tions can be expressed with the formal notions of enthymemes
Prop) m0 (m0 WantOlives)))) can be realised as a natu-                and topoi, following the work of Breitholtz [2020] on rea-
ral language utterance like “Okay, let’s forget olives, hehe”,        soning in conversation. Breitholtz and Maraev [2019] used
where laughter is used as a smoothing device to mitigate              these notions to analyse conversational humour as well as
the awkwardness of system failure. Larsson [2002] often               canned jokes, and we think it would be helpful to inte-
included an apology “Sorry” in some of the ICM moves,                 grate them into our framework in order to account for humour in
e.g. “Sorry, I didn’t understand that”. With some possible            dialogue systems. Dybala et al. [2010] emphasise the impor-
caveats, we can sometimes include slight laughter in such             tance of the “two-stage” approach to humour in dialogue sys-
moves, especially if a system is getting a bit repetitive and         tems, where the system tracks the emotional state of the user,
produces (ICM Und Neg) too often. Considering the evi-                produces humour as a reaction to certain states and analyses
dence for laughter often accompanying apology (as a separate          user’s further emotional reaction.
dialogue act) presented in Section 3.2, this can mimic natural        6.2   Surprise
behaviour in dialogue.
                                                                      Intuitively, laughter is related to events that are not expected
                                                                      in interaction. One of the ways to establish some degree of
6     Discussion and future work                                      natural behaviour for a dialogue system would be to react sin-
In this paper we have shown how some types of laughter can            cerely to these kinds of surprising events. A possible measure
be accounted for in a task-oriented spoken dialogue system. We        for a system’s surprisal is how confused it is by the user in-
proposed our own proof-theoretic architecture of a dialogue           put. A natural measure for this from information theory is
manager based on the KoS framework and extended it with some          perplexity, a probability-based metric. For N words in an
grounding strategies. Based on this, we have shown how cer-           evaluation set W = w1 w2 . . . wN , the average perplexity per
tain types of laughter can be processed within the dialogue           word is computed as follows:
                                                                      $$PP(W) = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_1 \ldots w_{i-1})}} \qquad (1)$$
manager and natural language generator, namely: laughter as
a negative feedback, laughter as a negative answer to a po-
lar question and laughter as a signal accompanying system
feedback.
   In the following subsections we discuss several issues re-            Given a language model, we can employ a threshold de-
lated to laughter in spoken dialogue systems that only touch          fined on perplexity, above which the system acts as if
on the main subject of the paper.                                     surprised, e.g. by saying “Ha-ha, I did not expect this!”
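
The per-word perplexity of Equation (1) and the threshold-based reaction can be sketched in Python as below. The function names and the threshold value used in the example are assumptions, and the conditional word probabilities are taken as given (in practice they would come from a language model). The product is computed in log space to avoid numerical underflow on long inputs.

```python
import math


def perplexity(probs):
    """Per-word perplexity: the N-th root of the product of inverse probabilities."""
    n = len(probs)
    if n == 0:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / n)


def react(probs, threshold):
    """Return a surprised reply when perplexity exceeds the (assumed) threshold."""
    if perplexity(probs) > threshold:
        return "Ha-ha, I did not expect this!"
    return None


# conditional probabilities P(w_i | w_1 ... w_{i-1}) for a two-word input
expected = perplexity([0.5, 0.5])
reply = react([0.01, 0.02], threshold=20.0)
```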
                                                                         Similarly, perplexity can be inferred from tracking a dia-
6.1    Humour                                                         logue state in a Dialogue State Tracking task [Mrkšić et al.,
We start with humour, which is usually considered in relation         2017], which is a common task in statistical approaches to di-
to jokes generated by dialogue systems, but here we present           alogue systems. Or, following Noble and Maraev [2021], an
more subtle incongruities related to humour in task-oriented          RNN trained on a large dialogue corpus as a representation of
dialogue.                                                             dialogue context can be used to calculate perplexity.
                                                                         Laughter as a reaction of surprise can relate to the levels
 (9) DEC:28_NM_loc2
      17    Caller okay so it starts with a                           of feedback, for example, a user surprised by a pragmatically
      18    Caller L                                                  incoherent system’s reply can laugh (Section 5.1). But here
      19 Operator L?                                                  surprise is taken in isolation, as a measure in its own right.
      20    Caller as in london                                       6.3   Awkwardness and time-saving
      21 Operator yes
      22    Caller A as in america                                    In (9), “under” is produced after a long pause (l.25), which
      23 Operator america                                             indicates awkwardness: the difficulty in producing the phonetic
      24    Caller er U                                               spelling made the operator wait, making the situ-
      25    Caller as in er ((pause: 1.2s))                           ation uncomfortable for the caller, so laughter was used to
      26    Caller er under                                           smooth it.
      27    Caller                                                In the follow-up excerpt (10) from the same dialogue,
      28 Operator under yes                                           the user’s awkwardness continues and she accompanies it with




laughter. Firstly, she laughs (l.139), demonstrating that she
has given up finding any phonetic spelling for “K”, releasing
the turn and allowing the operator to carry on. Her second
laugh smooths her slight embarrassment after the situation
was resolved by the operator.

(10) DEC:28_NM_loc2
      134    Caller O for oslo
      135 Operator O for oslo
      136    Caller again O for oslo
      137 Operator O for oslo
      138    Caller and K for er ((pause: 1.6s))
      139    Caller ((laughter))
      140 Operator as in king?
      141    Caller k- king ((laughter)) yeah
      142 Operator yes
      143    Caller thank you
      144 Operator that’s it?
      145    Caller that’s it

   We can hypothesise that in a dialogue system these examples
can be handled as follows. For a system, there are operations
which the developer knows are going to take time due to
technical constraints, but which the user expects to be
immediate. In this case, a system can produce behaviour similar
to that in (9) (l.25–27): “er. . . (pause) [comes up with an
answer] (laughter)”. A system can also detect the pattern of
filled pause + laughter from the user and treat it as a
turn-release cue. Such a pattern can signal either that
something confused the user, or that she genuinely could not
come up with an answer due to certain difficulties. The
downplayer dialogue act (e.g. “don’t worry”) or laughter in
response can also be appropriate as system feedback in such
a situation. We consider these ideas a subject for further
empirical investigation.
   Laughter that smooths retrieval difficulties can itself be
informative. Consider the case of language tutoring. In the
Anki “flashcard” app, the system provides users with a word
in one language on the front side of the card and the user
should provide a translation. The user then gets the correct
response from the back of the card and evaluates her own
response (was this card Hard, Good or Easy to recall). If
we consider making a similar conversational app, indications
of retrieval issues (filled pauses such as “er em. . . ” and
follow-up smoothing by laughter) can lead to the decision to
flag this card as “Hard” and provide corresponding feedback (11).

(11)  S    What is the Swedish for donkey?
      U    er em . . . åsna?..
      S    Yes, that was tough, but it is correct!
           (system marks the card as “Hard”)

6.4   Approaches to evaluation
Each of the aforementioned improvements has to be evaluated
within the dialogue system. We expect these improvements to
be reflected in the following evaluation criteria.
   Some of the improvements would fall under objective,
checklist-style criteria, such as being able to understand
laughter as negative feedback, or as a signal of surprise. The
same goes for the system’s laughter as an appropriate reaction
to conversational humour.
   Another portion of the features can be evaluated only
subjectively: for example, it is a question of user preference
whether it is okay for a system to accompany asking for a
favour (e.g. “Let’s start over!”) with laughter. For this
purpose, we can employ subjective evaluation methods such as
the more task-oriented SASSI [Hone and Graham, 2000] or the
more chatterbot-oriented methodology proposed by Dybala
et al. [2009], which was used for humour-equipped chatbots.
We optimistically expect that characteristics such as
naturalness and likeability would increase and annoyance would
decrease.

Acknowledgments
The research reported in this paper was supported by grant
2014-39 from the Swedish Research Council, which funds
the Centre for Linguistic Theory and Studies in Probability
(CLASP) in the Department of Philosophy, Linguistics,
and Theory of Science at the University of Gothenburg. In
addition, we would like to thank Staffan Larsson, Jonathan
Ginzburg and our anonymous reviewers for their useful
comments.

References
Andreas Abel and Jean-Philippe Bernardy. A unified view of
   modalities in type systems. Proceedings of the ACM on
   Programming Languages, 4(ICFP), 2020.
James F Allen, Lenhart K Schubert, George Ferguson, Peter
   Heeman, Chung Hee Hwang, Tsuneaki Kato, Marc Light,
   Nathaniel Martin, Bradford Miller, Massimo Poesio, et al.
   The TRAINS project: A case study in building a
   conversational planning agent. Journal of Experimental &
   Theoretical Artificial Intelligence, 7(1):7–48, 1995.
Jens Allwood. An activity based approach to pragmatics.
   1995.
Robert Atkey. Syntax and semantics of quantitative type
   theory. In Proceedings of the 33rd Annual ACM/IEEE
   Symposium on Logic in Computer Science, LICS 2018, Oxford,
   UK, pages 56–65, 2018.
Christian Becker-Asano and Hiroshi Ishiguro. Laughter in
   social robotics - no laughing matter. In Intl. Workshop on
   Social Intelligence Design, pages 287–300. Citeseer, 2009.
Anastasia Bondarenko, Christine Howes, and Staffan Larsson.
   Directory enquiries corpus, Feb 2020.
Ivan Bratko. Prolog programming for artificial intelligence.
   Pearson education, 2001.
Ellen Breitholtz and Vladislav Maraev. How to put an
   elephant in the title: Modeling humorous incongruity with
   topoi. In Proceedings of the 23rd Workshop on the Semantics
   and Pragmatics of Dialogue - Full Papers, London,
   United Kingdom, September 2019. SEMDIAL.
Ellen Breitholtz. Enthymemes and Topoi in Dialogue: The
   Use of Common Sense Reasoning in Conversation. Brill,
   Leiden, The Netherlands, 2020.




Herbert H Clark. Using language. Cambridge university
   press, 1996.
Yu Ding, Ken Prepin, Jing Huang, Catherine Pelachaud, and
   Thierry Artières. Laughter animation synthesis. In Proc.
   AAMAS 2014, pages 773–780. International Foundation for
   Autonomous Agents and Multiagent Systems, 2014.
Lucas Dixon, Alan Smaill, and Tracy Tsang. Plans, actions
   and dialogues using linear logic. Journal of Logic,
   Language and Information, 18(2):251–289, 2009.
Pawel Dybala, Michal Ptaszynski, Rafal Rzepka, and Kenji
   Araki. Subjective, but not worthless - non-linguistic
   features of chatterbot evaluations. In 6th IJCAI Workshop
   on Knowledge and Reasoning in Practical Dialogue Systems,
   page 87. Citeseer, 2009.
Pawel Dybala, Michal Ptaszynski, Rafal Rzepka, and Kenji
   Araki. Extending the chain: humor and emotions in human
   computer interaction. International Journal of
   Computational Linguistics Research, 1(3):116–125, 2010.
Kevin El Haddad, Sandeep Nallan Chakravarthula, and James
   Kennedy. Smile and laugh dynamics in naturalistic dyadic
   interactions: Intensity levels, sequences and roles. In
   2019 International Conference on Multimodal Interaction,
   pages 259–263, 2019.
Jonathan Ginzburg, Chiara Mazzocconi, and Ye Tian. Laughter
   as language. Glossa: a journal of general linguistics,
   5(1), 2020.
Jonathan Ginzburg. The Interactive Stance. Oxford University
   Press, 2012.
J.-Y. Girard. Linear Logic: its syntax and semantics, pages
   1–42. London Mathematical Society Lecture Note Series.
   Cambridge University Press, 1995.
Phillip Glenn. Laughter in Interaction. Cambridge University
   Press, Cambridge, UK, 2003.
Kate S Hone and Robert Graham. Towards a tool for the
   subjective assessment of speech system interfaces (SASSI).
   2000.
Christine Howes, Anastasia Bondarenko, and Staffan Larsson.
   Good call! Grounding in a Directory Enquiries Corpus.
   In Proceedings of the 23rd Workshop on the Semantics
   and Pragmatics of Dialogue, London, United Kingdom,
   September 2019. SEMDIAL.
Gail Jefferson. On the organization of laughter in talk about
   troubles. In Structures of Social Action: Studies in
   Conversation Analysis, pages 346–369. 1984.
Kristiina Jokinen. Constructive dialogue modelling: Speech
   interaction and rational agents, volume 10. John Wiley &
   Sons, 2009.
D Jurafsky, E Shriberg, and D Biasca. Switchboard dialog
   act corpus. International Computer Science Inst. Berkeley
   CA, Tech. Rep, 1997.
Staffan Larsson. Issue-based dialogue management. PhD
   thesis, University of Gothenburg, 2002.
Vladislav Maraev, Jean-Philippe Bernardy, and Jonathan
   Ginzburg. Dialogue management with linear logic:
   the role of metavariables in questions and clarifications.
   Traitement Automatique des Langues (TAL), 61(3):43–67,
   2020.
Chris Martens. Programming Interactive Worlds with Linear
   Logic. PhD thesis, Carnegie Mellon University, Pittsburgh,
   PA, 2015.
Chiara Mazzocconi. Laughter in interaction: semantics,
   pragmatics and child development. PhD thesis, Université
   de Paris, 2019.
Nikola Mrkšić, Diarmuid Ó Séaghdha, Tsung-Hsien Wen,
   Blaise Thomson, and Steve Young. Neural belief tracker:
   Data-driven dialogue state tracking. In Proceedings of the
   55th Annual Meeting of the Association for Computational
   Linguistics (Volume 1: Long Papers), pages 1777–1788,
   2017.
Anton Nijholt. Embodied agents: A new impetus to humor
   research. In The April Fools Day Workshop on
   Computational Humour, volume 20, pages 101–111. In: Proc.
   Twente Workshop on Language Technology, 2002.
Bill Noble and Vladislav Maraev. Large-scale text
   pre-training helps with dialogue act recognition, but not
   without fine-tuning. In Proceedings of the 14th International
   Conference on Computational Semantics - Short Papers,
   Groningen, Netherlands, 2021.
Magalie Ochs and Catherine Pelachaud. Socially aware
   virtual characters: The social signal of smiles [social
   sciences]. IEEE Signal Processing Magazine,
   30(2):128–132, Mar 2013.
Cécile Petitjean and Esther González-Martínez. Laughing
   and smiling to manage trouble in french-language
   classroom interaction. Classroom Discourse, 6(2):89–106,
   2015.
Fernando Poyatos. Paralanguage: A linguistic and
   interdisciplinary approach to interactive speech and sounds,
   volume 92. John Benjamins Publishing, 1993.
Joshua Raclaw and Cecilia E Ford. Laughter and the
   management of divergent positions in peer review interactions.
   Journal of Pragmatics, 113:1–15, 2017.
Verena Rieser and Oliver Lemon. Reinforcement learning
   for adaptive dialogue systems: a data-driven methodology
   for dialogue management and natural language generation.
   Springer Science & Business Media, 2011.
Emanuel A Schegloff and Harvey Sacks. Opening up closings.
   Semiotica, 8(4):289–327, 1973.
Jérôme Urbain, Radoslaw Niewiadomski, Elisabetta Bevacqua,
   Thierry Dutoit, Alexis Moinet, Catherine Pelachaud,
   Benjamin Picart, Joëlle Tilmanne, and Johannes Wagner.
   AVLaughterCycle. J. Multimodal User Interfaces, 4(1):47–58,
   2010.
Julia Vettin and Dietmar Todt. Laughter in conversation:
   Features of occurrence and acoustic structure. Journal of
   Nonverbal Behavior, 28(2):93–115, 2004.




Jason D Williams, Kavosh Asadi, and Geoffrey Zweig. Hy-
   brid code networks: practical and efficient end-to-end di-
   alog control with supervised and reinforcement learning.
   arXiv preprint arXiv:1702.03274, 2017.
Steve Young, Milica Gašić, Simon Keizer, François Mairesse,
   Jost Schatzmann, Blaise Thomson, and Kai Yu. The hid-
   den information state model: A practical framework for
   POMDP-based spoken dialogue management. Computer
   Speech & Language, 24(2):150–174, 2010.



