=Paper= {{Paper |id=Vol-2448/SSS19_Paper_Upload_211 |storemode=property |title=Evaluating Cognitive Bias in Two-Party and Multi-Party Spoken Interactions |pdfUrl=https://ceur-ws.org/Vol-2448/SSS19_Paper_Upload_211.pdf |volume=Vol-2448 |authors=Christina Alexandris |dblpUrl=https://dblp.org/rec/conf/aaaiss/Alexandris19 }} ==Evaluating Cognitive Bias in Two-Party and Multi-Party Spoken Interactions== https://ceur-ws.org/Vol-2448/SSS19_Paper_Upload_211.pdf
               Evaluating Cognitive Bias in Two-Party and Multi-Party
                                                 Spoken Interactions
                                                     Christina Alexandris
                                            National and Kapodistrian University of Athens
                                                         calexandris@gs.uoa.gr




Abstract

To by-pass Cognitive Bias in two-party discussions and interviews containing longer speech segments, the proposed semi-automatic procedure “takes the temperature” of a transcribed dialog by measuring the number of detected points of possible tension and/or conflict between speakers-participants.

Introduction

Human Computer Interaction (HCI) systems may assist in the evaluation of complex Human-Human interaction, as in the case of applications designed for journalists (Alexandris, Nottas and Cambourakis, 2015).

In spoken dialogs concerning complex interactions between speakers-participants, as in the case of spoken journalistic texts, some aspects can be evaluated by semi-automatic or interactive procedures that aim to by-pass Cognitive Bias, and other aspects can be evaluated by interactive procedures that aim to register Cognitive Bias.

Unlike task-specific dialogs (Tung et al., 2013) and typical collaborative dialogs (Yang, Levow and Meng, 2012, Wang et al., 2013), the Speech Acts performed by one or multiple speakers-participants may often involve complex Illocutionary Acts beyond the defined framework of the interaction. Specifically, the Illocutionary Act (Searle, 1969, Austin, 1962) performed by the Speaker concerned may not be restricted to “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview: speakers-participants may have other or additional intentions regarding their presence and their role in the discussion or interview concerned. In the spoken journalistic texts concerned, Illocutionary Acts not restricted to “Obtaining Information Asked” or “Providing Information Asked” are related to other or additional Speaker intentions. For example, a Speaker may focus on emphasizing an opinion (or the policy of the network concerned) or on (purposefully) creating tension in the interview or discussion. Furthermore, a consistent avoidance of the topics addressed may indicate that the Speaker is more interested in showing a mere presence in the discussion or interview than in sharing any information.

The existence of additional, “hidden” Illocutionary Acts can be identified by procedures evaluating the behavior of speakers-participants in relation to specific values and benchmarks. The presentation and calculation of these values make it possible to by-pass or register Cognitive Bias. The Cognitive Bias by-passed or registered primarily concerns (i) the speakers-participants concerned, but the same values may also serve for the evaluation of the Confidence Bias of (ii) the user-evaluator of the recorded and transcribed discussion or interview.

By-Passing and Registering Cognitive Bias in Multiple Speaker Discussions or in Short Speech Segments

In smaller speech segments with constant and quick change of speaker turns and with a discourse structure compatible with models where each participant selects self (Wilson, 2005, Sacks, Schegloff and Jefferson, 1974), topic tracking (and topic change) allows the evaluation of speaker behavior and enables the identification of the speaker’s intentions and the Illocutionary Speech Acts performed (Searle, 1969, Austin, 1962). Topic tracking can be applied especially in short speech segments with two or multiple speakers-participants (Alexandris, 2018). The content of relatively short utterances can be summarized with keywords chosen from each utterance by the user-evaluator (Alexandris, 2018), with the assistance of the Stanford POS Tagger for the automatic signalization of nouns in each turn taken by the Speakers in the respective segment in the dialog structure. The registered
and tracked keywords, treated as local variables, signalize each topic and the relations between topics, since automatic Rhetorical Structure Theory (RST) analysis procedures (Stede, Taboada and Das, 2017, Zeldes, 2016) usually involve larger (written) texts and may not produce the required results.

The System generates a visual representation from the user’s interaction, tracking the corresponding selected topic-keywords in the dialog flow, as well as the chosen types of relations between them. The interactive generation of registered paths is similar to the paths with generated sequences of recognized keywords in spoken dialog systems, in the domains of consumer complaints and mobile phone services call centers (Nottas et al., 2007, Floros and Mourouzidis, 2016). This function is similar to user-independent evaluations of spoken dialog systems (Williams, Asadi and Zweig, 2017) for by-passing User bias (Nass and Brave, 2005, Cohen, 1997). Keywords (topics) may be repeated, related to a more general concept (or global variable) (Lewis, 2009), or related to keywords (topics) concerning similar functions. These cases correspond to the Repetition, Generalization and Association relations respectively, with the visual representations of Distances 1 (value “1”), 2 (value “2”) and 3 (value “3”) respectively (Alexandris, 2018). A keyword involving a new command or function is registered as a new topic (New Topic, visual representation of Distance 4, corresponding to value “0”). The sequence of topics chosen by the user and the perceived relations between them generate a “path” of interaction, forming distinctive visual representations stored in a database currently under development: topics and words generating diverse reactions and choices from users result in different forms of generated visual representations for the same conversation and interaction (Alexandris, 2018).

The generated visual representations depict topics avoided, introduced or repeatedly referred to by each speaker-participant, and in specific types of cases may indicate the existence of additional, “hidden” Illocutionary Acts other than “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview. Thus, the evaluation of speaker-participant behavior aims to by-pass Cognitive Bias, specifically the Confidence Bias (Hilbert, 2012) of the user-evaluator, especially if multiple users-evaluators produce different forms of generated visual representations for the same conversation and interaction and these are compared to each other in the database. In this case, the chosen relations between topics may describe Lexical Bias (Trofimova, 2014) and may differ according to political, socio-cultural and linguistic characteristics of the user-evaluator, especially if international users are concerned (Yu et al., 2010, Alexandris, 2010, Ma, 2010, Pan, 2000), due to a lack of world knowledge of the language community involved (Paltridge, 2012, Hatim, 1997, Wardhaugh, 1992). The envisioned further development of the generated visual representations is their modeling in the form of graphs, similar to discourse trees (Marcu, 1999, Carlson, Marcu and Okurowski, 2001).

Evaluation and Benchmarks

The types of relations-distances between word-topics chosen by the user-evaluator are registered and counted. If the number of (a) “Repetitions”, (b) “Generalizations” or (c) “Topic Switches” is well over 50% of the registered relations-distances between word-topics, the interaction is signalized for further evaluation as containing Illocutionary Acts not restricted to “Obtaining Information Asked” or “Providing Information Asked”. The following benchmarks indicate interactions with Illocutionary Acts beyond the predefined framework of the dialog for multiple Speaker discussions and/or short speech segments, where Ds = Number of Distances and Sp = Number of Speaker turns:
• X = Ds ≤ Sp (calculating over 50% of “Repetitions” (Distance = 1, value “1”) or “Topic Switches” (Distance = 4, value “0”)).
• X = Ds > Sp × Gen (Gen = Sp × 3 ÷ 2) (calculating over 50% of “Generalizations” (Distance = 3, value “3”)).

These benchmarks for dialogs with short speech segments can be referred to as “(Topic) Relevance” benchmarks with a value of “X” or “Relevance (X)”.

By-Passing and Registering Cognitive Bias in Two-Party Discussions and Interviews or in Long Speech Segments

The further development of the database containing registered spoken interaction for determining and evaluating Cognitive Bias in spoken journalistic texts (Alexandris, 2018) involved the processing of discussions and interviews containing larger speech segments. As in the above-described multiple speaker discussions and short speech segments, the Illocutionary Act performed by the Speaker concerned may not be restricted to “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview.

In two-party discussions and interviews containing longer speech segments, the discourse structure is more compatible with turn-taking in “push-to-talk conversations”, with a strict protocol in managing the interview or discussion and turn-taking (Taboada, 2006). In this case, speakers-participants usually do not have the liberty of modifying or changing the topic, with the result that topic tracking is insufficient for the identification of the speaker’s intentions. In larger speech segments mostly occurring in interviews with a strict
protocol and a set of predefined topics, automatic Rhetorical Structure Theory (RST) analysis procedures (Stede, Taboada and Das, 2017, Zeldes, 2016) can be performed on the transcribed text, on the condition that the speaker is allowed sufficient time to elaborate on the topic in question. The extent to which automatic RST analysis procedures can be executed on the transcribed text indicates the degree of collaborative interaction between the speakers-participants, especially on the part of the journalist-interviewer (referred to as Speaker 1), since the speaker-participant is allocated enough time to elaborate and/or argue on the topic concerned.

In the case of discussions and interviews containing larger speech segments, the identification of the speaker’s intentions and the detection of “hidden” Illocutionary Acts follow a process locating points of possible tension and/or conflict between speakers-participants. At these points, Cognitive Bias can be both by-passed and registered. Cognitive Bias is by-passed by signalizing and counting the points of possible tension and/or conflict between speakers-participants, henceforth referred to as “hot spots”. The signalization of “hot spots” is based on the violation of the Quantity, Quality and Manner Maxims of the Gricean Cooperative Principle (Grice, 1975). Cognitive Bias is registered by comparing the content of the Speaker turns in the signalized “hot spots” and assigning a respective value.

By-Passing Cognitive Bias: Automatic Signalization of “Hot Spots” and the Gricean Cooperative Principle

To by-pass Cognitive Bias in two-party discussions and interviews containing longer speech segments, the proposed semi-automatic procedure “takes the temperature” of a transcribed dialog by measuring the number of detected points of possible tension and/or conflict between speakers-participants. These points, referred to as “hot spots”, concern speech segments where a speaker turn, namely a switch between Speaker 1 and Speaker 2, is recognized by the speech recognition module of the transcription tool. The signalization of multiple “hot spots” indicates a more argumentative than collaborative interaction, even if speakers-participants display calm and composed behavior. In particular, the Illocutionary Act performed by the Speaker concerned may not be restricted to “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview.

A “hot spot” consists of the pair of utterances of both speakers, namely a question-answer pair, a statement-response pair or any other type of relation between speaker turns. In longer utterances, the first 60 words of the second speaker’s (Speaker 2) utterance are processed (approximately 1-3 sentences, depending on length, with an average sentence length of 15-20 words (Cutts, 2013)) and the last 60 words of the first speaker’s (Speaker 1) utterance are processed (approximately 1-3 sentences, depending on length). The automatically signalized “hot spots” are extracted to a separate template for further processing. The extraction contains not only the detected segments but also the complete utterances consisting of both speaker turns of Speaker 1 and Speaker 2.

For a segment of speaker turns to be automatically identified as a “hot spot”, at least two of the following three conditions (1), (2) and (3) must apply to one or to both of the speakers’ utterances. Conditions (1) and (2) are directly or indirectly related to the flouting of Maxims of the Gricean Cooperative Principle (Grice, 1975). The conditions are the following:
• (1) Additional, modifying features: In one or in both speakers’ utterances in the segment of speaker turns there is at least one phrase containing (a) a sequence of two adjectives (ADJ ADJ), (b) an adverb and an adjective (or more adjectives) (ADV ADJ) or (c) two adverbs (ADV ADV). These forms of adjectival or adverbial phrases are detectable with a POS Tagger (for example, the Stanford POS Tagger).
• (2) Reference to the interaction itself and to its participants with negation: In one or in both speakers’ utterances, (a) the subject of the sentence containing the negation is “I” or “you” ((I/You) “don’t”, “do not”, “cannot”) and (b) in the verb phrase (VP) there is at least one speech-related or behavior verb-stem referring to the dialog itself (for example, “speak”, “listen”, “guess”, “understand”). This also applies to parts of speech other than verbs (e.g. “guessing”, “listener”) as well as to words constituting parts of expressions related to speech or behavior (“conclusions”, “words”, “mouth”, “polite”, “nonsense”, “manners”). The different forms of negation are detectable with a POS Tagger. The respective words and word categories may constitute a small set of entries in a specially created lexicon or may be retrieved from existing databases or WordNets.
• (3) Prosodic emphasis and/or Exclamations: (a) Exclamations include expressions such as “Look”, “Wait” and “Stop”. As in the above-described case (2), the respective words and word categories may constitute a small set of entries in a specially created lexicon or may be retrieved from existing databases or WordNets. (b) Prosodic emphasis, detected in the speech processing module, may occur in one or more of the above-described words of categories (1a, 1b, 1c, 2a and 2b) or in the noun or verb following (modified by) 1a, 1b and 1c.

In the case of 1a, 1b and 1c, extra information is added to the basic content of the utterance, which consists of the necessary information required to fulfil the Gricean Cooperative Principle in respect to the Maxim of Quantity (“Do not make your contribution more informative than is required”). Here, the Speaker violates the Maxim of Quantity in the Gricean Cooperative Principle. In the case of 2a and 2b, the Speaker perceives a violation of the Gricean Cooperative
Principle by the previous Speaker. In particular, the content of the speaker’s utterance is not limited to the current topic in question but refers to the dialog itself, mostly functioning as a comment. Specifically, 2a and 2b imply a violation of the Gricean Cooperative Principle in respect to the Maxim of Quality (“1. Do not say what you believe to be false”, “2. Do not say that for which you lack adequate evidence”) (Grice, 1975) and/or in respect to the Maxim of Manner (Submaxim 2. “Avoid ambiguity”) (Grice, 1975) in the utterance of the previous Speaker. In other words, in 2a and 2b, the Speaker considers the content of the previous Speaker’s utterance to be unacceptable, ambiguous, false or controversial.

The number of automatically signalized “hot spots” indicates the degree to which discussions and interviews containing larger speech segments constitute dialogs with many points of tension and/or conflict. The average duration of discussions and interviews containing larger speech segments in the Media is 30 to 45 minutes. A typical example of a dialog with many detected points of possible tension and/or conflict between speakers-participants is an approximately 32-minute interview with seven (7) registered “hot spots” (BBC (British Broadcasting Corporation): HARDtalk interview by journalist Stephen Sackur on 16th of April 2018). One or both speakers’ utterances may display two or more of features (1), (2) and (3).

Evaluation and Benchmarks

A remarkable degree of tension in a discussion is signalized by multiple detected “hot spots” and not by sporadic occurrences of “hot spots”. Thus, 1-2 “hot spot” occurrences in the longer speech segments in question (30-45 mins) signalize a low degree of tension. A remarkable degree of tension in a 30-45 minute discussion or interview is related to at least 4 detected “hot spots” (where 3 “hot spots” constitute a marginal value). Considering the above, the benchmark for evaluating a remarkable degree of tension is calculated from the duration of the discussion or interview in the Media (for example, 35 mins) and the number of signalized “hot spots” (SPEECH SEGMENT-count) in Speaker turns. The defined benchmark for evaluating Speaker behavior is the number of minutes divided by the number of identified speech segments signalized as “hot spots”, which should yield a single-digit value if the above-described minimal number of at least 4 detected “hot spots” is reached. For example, the acceptable values are “8.75”, “7” or, ideally, “5” (for a file of 35 minutes) versus “17.5” or “11.6” (for a file of 35 minutes). Interactions with Illocutionary Acts beyond the predefined framework of the dialog-discussion (with two speakers-participants and long speech segments) are identified from the detected points of possible tension and/or conflict indicated by the following benchmark, where Y = wav file length in minutes divided by (÷) the number of “hot spot” signalized speech segments:
• Y < 10.
• Example: File length = 35 mins, SPEECH SEGMENT-count: 5, Evaluation: 7.
These benchmarks for dialogs with long speech segments can be referred to as “Tension” benchmarks with a value of “Y” or “Tension (Y)”.

Registering Cognitive Bias: Interactive Comparison of Speaker Turns

The registration of Cognitive Bias concerns the comparison of the actual content of the pair of utterances of Speaker 1 and Speaker 2 in the signalized “hot spots”. As stated above, the automatically signalized “hot spots” are extracted to a separate template for interactive processing, where the “hot spot” utterances of both speakers are compared. If the last 60 words (approximately 1-3 sentences, with an average sentence length of 15-20 words (Cutts, 2013)) of the first speaker’s utterance contain at least two of the above-described features (1), (2) and (3), the Quantity and Manner Maxims of the Gricean Cooperative Principle (Grice, 1975) are violated. Specifically, Cognitive Bias is registered by comparing the content of the Speaker turns in the signalized “hot spots” and assigning the following respective values:
• (a) Each “hot spot” is marked with a (1,1) if both speakers’ utterances are considered equally non-collaborative.
• (b) If this is the case for only one of the two speakers, in particular Speaker 1, the “hot spot” is marked with a (1,0) for Speaker 1 (in this case, the journalist-reporter). In this case, the style of the question or statement uttered is not considered acceptable: it contains features violating the Gricean Cooperative Principle in respect to the Maxim of Manner or the Maxim of Quantity (Irony), or in respect to the Maxim of Quality (content is considered false (“F”)).
• (c) If this is the case for Speaker 2, the “hot spot” is marked with a (0,1) if the interviewee’s (Speaker 2) reaction is not justified in respect to the style and content of the utterance of Speaker 1.
• (d) If a “hot spot” speech segment is evaluated by the User as not being a point of possible tension and/or conflict between speakers-participants, the false “hot spot” is marked with a (0,0) for both Speakers.

Evaluation and Benchmarks

Both Speakers may have an equal number of gradings of “1” in all extracted “hot spots”, or one of the Speakers may have a slightly higher/lower or a considerably higher/lower number of gradings of “1”. A grading of “1” in 50% or more of the “hot spots” signalizes that the Illocutionary Act performed by the Speaker concerned is not restricted to “Obtaining Information Asked” or “Providing Information
Asked”. Speaker behavior indicating that Illocutionary Acts           Alexandris, C. 2010. English, German and the International “Semi-
performed are not restricted to the predefined interaction            professional” Translator: A Morphological Approach to Implied
                                                                      Connotative Features. Journal of Language and Translation, Sep-
framework is evaluated by the following benchmarks, where             tember 2010, Vol. 11, 2, Sejong University, Korea: 7-46.
Z = the number of “hot spot” signalized speech segments
                                                                      Austin, J. L. 1962. How to Do Things with Words. Urmson, J.O.
divided by (÷) 2 (50%):                                               and Sbisà, M. eds., 2nd edition, 1976, Oxford, UK: Oxford Uni-
• Sum of Speaker grades ≥ Z.                                          versity Press.
• Example: Evaluation of Speaker Behavior (Speaker 1 is               Carlson, L.; Marcu, D.; and Okurowski, M. E. 2001. Building a
   less collaborative than Speaker 2).                                Discourse-Tagged Corpus in the Framework of Rhetorical Struc-
• SPEAKER1: (1), (1), (1), (0), (1).                                  ture Theory. In Proceedings of the 2nd SIGDIAL Workshop on Dis-
                                                                      course and Dialogue, Eurospeech 2001, Denmark, September
• SPEAKER2: (0), (0), (1), (1), (0).                                  2001, ACL Anthology, W01-16.
• File length: 35 mins, SPEECH-SEGMENT-count “hot                     Cohen, P.; Johnston, M.; McGee, D.; Oviatt, S.; Pittman, J.;
   spots”: 5 (sum of grades = 6, 6 ≥ Z where Z = 2.5).                Smith, I.; Chen, L.; and Clow, J. 1997. Quickset: Multimodal
These benchmarks for dialogs with long speech segments                Interaction for Distributed Applications. In Proceedings of the 5th
can be referred to as “Collaboration” benchmarks with a               ACM International Multimedia Conference, 31-40, New York,
                                                                      NY: ACM Digital Library.
value of “Z” or “Collaboration (Z)”.
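
As an illustration, the arithmetic behind the “Tension (Y)” and “Collaboration (Z)” benchmarks can be sketched in Python. This is a minimal sketch and not the author's implementation: all function names are hypothetical. Two readings are assumptions: that the Tension check also requires the minimum of 4 “hot spots” mentioned in the text, and that each Speaker's own sum of “1” grades is compared with Z (the worked example in the text compares the combined sum of both Speakers, which the sketch also reports).

```python
def tension_y(file_minutes, hot_spot_count):
    """Y = file length in minutes divided by the number of hot spots."""
    if hot_spot_count <= 0:
        raise ValueError("Y is undefined when no hot spot was signalized")
    return file_minutes / hot_spot_count


def is_remarkably_tense(file_minutes, hot_spot_count, min_hot_spots=4):
    """Benchmark Y < 10; the minimum of 4 hot spots is an assumed extra guard."""
    if hot_spot_count < min_hot_spots:
        return False
    return tension_y(file_minutes, hot_spot_count) < 10


def collaboration_z(hot_spot_count):
    """Z = number of hot spots divided by 2 (the 50% threshold)."""
    return hot_spot_count / 2


def exceeds_z(grades, hot_spot_count):
    """True if the sum of "1" grades reaches Z (assumed per-Speaker reading)."""
    return sum(grades) >= collaboration_z(hot_spot_count)


# Worked example from the text: 35-minute file, 5 hot spots.
speaker1 = [1, 1, 1, 0, 1]
speaker2 = [0, 0, 1, 1, 0]
hot_spots = len(speaker1)

y = tension_y(35, hot_spots)
z = collaboration_z(hot_spots)
combined = sum(speaker1) + sum(speaker2)  # combined sum used in the text's example

print(f"Tension (Y) = {y}, remarkable tension: {is_remarkably_tense(35, hot_spots)}")
print(f"Collaboration (Z) = {z}, combined grades = {combined}")
print(f"Speaker 1 reaches Z: {exceeds_z(speaker1, hot_spots)}")
print(f"Speaker 2 reaches Z: {exceeds_z(speaker2, hot_spots)}")
```

With the grades from the example, only Speaker 1 reaches Z, which agrees with the statement that Speaker 1 is less collaborative than Speaker 2.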
                                                                      Cutts, M. 2013. Oxford Guide to Plain English. 4th edition, Ox-
                                                                      ford, UK: Oxford University Press.
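
Both the “Tension (Y)” and the “Collaboration (Z)” benchmarks take the automatically signalized “hot spots” as input. The rule that at least two of conditions (1), (2) and (3) must apply can be sketched as follows. This is a simplified sketch under stated assumptions: the input is already tokenized and tagged with Penn Treebank tags (as produced, for example, by the Stanford POS Tagger's English models), the small word lists are taken from the examples in the text and stand in for the specially created lexicon, prosodic emphasis is passed in as a flag from a separate speech module, and condition (2) is approximated by co-occurrence of subject, negation and speech-related word without any syntactic analysis. All names are hypothetical.

```python
WINDOW = 60  # words processed per utterance, as described in the text
SUBJECTS = {"i", "you"}
NEGATIONS = {"not", "n't", "don't", "cannot"}
SPEECH_STEMS = ("speak", "listen", "guess", "understand", "conclusion",
                "word", "mouth", "polite", "nonsense", "manner")
EXCLAMATIONS = {"look", "wait", "stop"}
MODIFIER_PAIRS = {("JJ", "JJ"), ("RB", "JJ"), ("RB", "RB")}


def windows(speaker1_tagged, speaker2_tagged, n=WINDOW):
    """Last n tokens of Speaker 1 and first n tokens of Speaker 2."""
    return speaker1_tagged[-n:], speaker2_tagged[:n]


def has_modifier_sequence(tagged):
    """Condition (1): ADJ ADJ, ADV ADJ or ADV ADV in adjacent tokens."""
    for (_, tag1), (_, tag2) in zip(tagged, tagged[1:]):
        if (tag1[:2], tag2[:2]) in MODIFIER_PAIRS:
            return True
    return False


def has_negated_dialog_reference(tagged):
    """Condition (2), approximated by co-occurrence within the utterance."""
    words = [word.lower() for word, _ in tagged]
    return (any(w in SUBJECTS for w in words)
            and any(w in NEGATIONS for w in words)
            and any(w.startswith(SPEECH_STEMS) for w in words))


def has_exclamation_or_emphasis(tagged, emphasized=False):
    """Condition (3): lexicon exclamation or externally detected emphasis."""
    return emphasized or any(word.lower() in EXCLAMATIONS for word, _ in tagged)


def is_hot_spot(speaker1_tagged, speaker2_tagged, emphasized=False):
    """True if at least two of the three conditions apply to the turn pair."""
    s1, s2 = windows(speaker1_tagged, speaker2_tagged)
    conditions = [
        has_modifier_sequence(s1) or has_modifier_sequence(s2),
        has_negated_dialog_reference(s1) or has_negated_dialog_reference(s2),
        has_exclamation_or_emphasis(s1, emphasized)
        or has_exclamation_or_emphasis(s2, emphasized),
    ]
    return sum(conditions) >= 2


# Hand-tagged toy turns (invented for illustration, not from the corpus).
tense_s1 = [("That", "DT"), ("is", "VBZ"), ("a", "DT"), ("very", "RB"),
            ("serious", "JJ"), ("claim", "NN")]
tense_s2 = [("Wait", "VB"), (",", ","), ("you", "PRP"), ("do", "VBP"),
            ("n't", "RB"), ("listen", "VB")]
calm_s1 = [("What", "WP"), ("happened", "VBD"), ("?", ".")]
calm_s2 = [("We", "PRP"), ("signed", "VBD"), ("the", "DT"), ("agreement", "NN")]

print(is_hot_spot(tense_s1, tense_s2))
print(is_hot_spot(calm_s1, calm_s2))
```

Evaluating each condition over the pair of turns, rather than per utterance, is one possible reading of “must apply to one or to both of the speakers' utterances”.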
        Conclusions and Further Research                              Du, J.; Alexandris, C.; Mourouzidis, D.; Floros, V.; and Iliakis,
                                                                      A. 2017. Controlling Interaction in Multilingual Conversation Re-
By-passing and registering Cognitive Bias in HCI systems              visited: A Perspective for Services and Interviews in Mandarin
assisting in the evaluation of Human-Human interaction in-            Chinese. In Kurosu, M. ed., Lecture Notes in Computer Science
                                                                      LNCS 10271: 573–583, Heidelberg, Germany: Springer.
volves both automatic and interactive procedures. Interac-
tive topic tracking in the dialog structure and automatic “hot        Floros, V. and Mourouzidis, D. 2016. Multiple Task Management
                                                                      in a Dialog System for Call Centers. Master’s Thesis, Department
spot” generation involving points of tension and/or conflict          of Informatics and Telecommunications, National University of
contribute to an evaluation of speakers-participants behavior         Athens, Greece.
and intentions during the interaction.                                Grice, H.P. 1975. Logic and conversation. In: Cole, P., Morgan, J.
   The behavior and Cognitive Bias of (i) speakers-partici-           eds., Syntax and Semantics, Vol. 3. Academic Press, New York.
pants is evaluated in relation to the values of the “Relevance        Hatim, B. 1997. Communication Across Cultures: Translation
(X)”, “Tension (Y)” and “Collaboration (Z)” benchmarks.               Theory and Contrastive Text Linguistics. Exeter, UK: University
However, the same benchmarks may be used for evaluating               of Exeter Press.
the Cognitive Bias- Confidence Bias of (ii) the user-evalua-          Hilbert, M. 2012. Toward a Synthesis of Cognitive Biases: How
tor of the recorded and transcribed discussion or interview.          Noisy Information Processing Can Bias Human Decision Making.
                                                                      Psychological Bulletin, Vol 138(2), Mar 2012: 211-237.
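
Of the three benchmarks named above, “Relevance (X)” rests on counting the relation types registered by the user-evaluator. A minimal sketch of the 50% rule stated in the prose follows. It is an illustration under assumptions: the relation labels and function names are hypothetical, “well over 50%” is approximated by a strict comparison with an adjustable threshold, and the Ds and Sp inequalities of the benchmark itself are not modeled.

```python
from collections import Counter

# Relation types whose share triggers further evaluation, per the text.
FLAGGED_RELATIONS = ("Repetition", "Generalization", "Topic Switch")


def relevance_shares(relations):
    """Share of each flagged relation type among all registered relations."""
    total = len(relations)
    counts = Counter(relations)
    if total == 0:
        return {label: 0.0 for label in FLAGGED_RELATIONS}
    return {label: counts[label] / total for label in FLAGGED_RELATIONS}


def needs_further_evaluation(relations, threshold=0.5):
    """True if any flagged relation type exceeds the threshold share."""
    return any(share > threshold
               for share in relevance_shares(relations).values())


# Toy path of registered relations (invented for illustration).
path = ["Repetition", "Repetition", "Association", "Repetition", "Topic Switch"]
shares = relevance_shares(path)

print(shares)
print(needs_further_evaluation(path))
```

In the toy path, three of five registered relations are Repetitions, so the interaction would be signalized for further evaluation.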
   Spoken dialogs concerning complex interactions between
speakers-participants are not limited to spoken journalistic          Lewis, J.R. 2009. Introduction to Practical Speech User Interface
                                                                      Design for Interactive Voice Response Applications, IBM Soft-
texts. As the variety and complexity of spoken HCI applica-           ware Group, USA, Tutorial T09 presented at HCI 2009 San Diego,
tions increases, Speech Acts performed by one or multiple             CA, USA
users-participants, even by the System itself, often may in-          M a, J. 2010. A comparative analysis of the ambiguity resolution of
volve Illocutionary Acts beyond the predefined framewo rk             two English-Chinese M T approaches: RBM T and SM T. Dalian
of a task-oriented dialog, especially in systems with emotion         University of Technology Journal, 31(3): 114-119.
recognition, virtual negotiation, psychological support or            M arcu, D. 1999. Discourse trees are good indicators of importance
decision-making.                                                      in text. In M ani, I. and M aybury, M . (eds), Advances in Automatic
                                                                      Text Summarization, Cambridge M A, The M IT Press: 123-136.
                                                                      Nass, C. and Brave, S. 2005. Wired for Speech: How Voice Acti-
                         References                                   vates and Advances the Human-Computer Relationship. Cam-
                                                                      bridge M A: The M IT Press.
Alexandris, C. 2018. M easuring Cognitive Bias in Spoken Interac-     Nottas, M . ; Alexandris, C. ; Tsopanoglou, A. ; and Bakamidis, S.
tion and Conversation: Generating Visual Representations. In: Be-
                                                                      2007. A Hybrid Approach to Dialog Input in the CitzenShield Di-
yond M achine Intelligence: Understanding Cognitive Bias and Hu-
                                                                      alog System for Consumer Complaints. In Proceedings of HCII
manity for Well-Being AI. In Proceedings from the AAAI Spring         2007, Beijing, Peoples Republic of China.
Symposium, Stanford University, 204-206 Technical Report, SS-
18-03, Palo Alto, CA: AAAI Press.                                     Paltridge, B. 2012. Discourse Analysis: An Introduction. London,
                                                                      UK: Bloomsbury Publishing.
Alexandris, C. ; Nottas, M . ; and Cambourakis, G. 2015. Interac-
tive Evaluation of Pragmatic Features in Spoken Journalistic Texts.   Pan, Y. 2000. Politeness in Chinese Face-to-Face Interaction. Ad-
In Kurosu, M . ed., Human-Computer Interaction, Users and Con-        vances in Discourse Processes Series Vol. 67, Stamford, CT:
texts, LNCS Lecture Notes in Computer Science, Vol. 9171: 259-        Ablex Publishing Corporation.
268, Heidelberg, Germany: Springer.
Sacks, H. ; Schegloff, E. A. ; and Jefferson, G. 1974. A simplest
systematics for the organization of turn-taking for conversation.
Language, Vol. 50: 696-735.
Searle J. R. 1969. Speech Acts: An Essay in the Philosophy of Lan-
guage. Cambridge, M A: Cambridge University Press.
Taboada, M . 2006. Spontaneous and non-spontaneous turn-taking.
Pragmatics, Vol.16 (2-3): 329-360.
Stede, M . ; Taboada, M ., ; and Das, D. 2017. Annotation Guide-
lines for Rhetorical Structure. M anuscript. University of Potsdam
and Simon Fraser University. M arch 2017.
Trofimova I. 2014. Observer Bias: An Interaction of Temperament
Traits with Biases in the Semantic Perception of Lexical M aterial.
PLoS ONE 9(1): e85677.
Tung, T. ; Gomez, R. ; Kawahara, T. ; and M atsuyama, T. 2013.
M ulti-party Human-M achine Interaction Using a Smart M ulti-
modal Digital Signage. In Kurosu, M . ed., Human-Computer In-
teraction. Interaction Modalities and Techniques, Lecture Notes in
Computer Science, Vol. 8007, 2013, 408-415, Heidelberg, Ger-
many: Springer.
Wang, H. ; Gailliot, A. ; Hyden, D. ; and Lietzenmayer, R. 2013.
A Knowledge Elicitation Study for Collaborative Dialogue Strate-
gies Used to Handle Uncertainties in Speech Communication
While Using GIS. In Kurosu, M . ed., Human-Computer Interac-
tion, Lecture Notes in Computer Science 8007, 135-146, Heidel-
berg, Germany: Springer.
Wardhaugh, R. 1992. An Introduction to Sociolinguistics. 2nd edi-
tion. Oxford, UK: Blackwell.
Williams, J.D. ; Asadi, K. ; and Zweig, G. 2017. Hybrid Code
Networks: practical and efficient end-to-end dialog control with
supervised and reinforcement learning. In Proceedings of the 55th
Annual Meeting of the Association for Computational Linguistics,
Vancouver, Canada, July 30 - August 4, 2017, 665–677, Associa-
tion for Computational Linguistics -ACL.
Wilson, K. E. 2005. An oscillator model of the timing of turn-
taking. Psychonomic Bulletin and Review 2005:12 (6): 957-968.
Yang, Z. ; Levow G.A. ; and M eng F. H. 2012. Predicting User
Satisfaction in Spoken Dialog System Evaluation With Collabora-
tive Filtering. IEEE Journal of Selected Topics in Signal Pro-
cessing, Vol. 6, Issue: 8, Dec. 2012: 971 – 981.
Yu, Z. ; Yu, Z. ; Aoyama, H. ; Ozeki, M .; and Nakamura, Y. 2010.
Capture, Recognition, and Visualization of Human Semantic Inter-
actions in M eetings. In Proceedings of PerCom, Mannheim, Ger-
many, 2010.
Zeldes, A. 2016. "rstWeb - A Browser-based Annotation Interface
for Rhetorical Structure Theory and Discourse Relations". In Pro-
ceedings of NAACL-HLT 2016 System Demonstrations, San Diego,
CA, 1-5, ACL Anthology, NAACL-HLT 2016.