=Paper=
{{Paper
|id=Vol-2448/SSS19_Paper_Upload_211
|storemode=property
|title=Evaluating Cognitive Bias in Two-Party and Multi-Party Spoken Interactions
|pdfUrl=https://ceur-ws.org/Vol-2448/SSS19_Paper_Upload_211.pdf
|volume=Vol-2448
|authors=Christina Alexandris
|dblpUrl=https://dblp.org/rec/conf/aaaiss/Alexandris19
}}
==Evaluating Cognitive Bias in Two-Party and Multi-Party Spoken Interactions==
Christina Alexandris, National and Kapodistrian University of Athens, calexandris@gs.uoa.gr

===Abstract===
To by-pass Cognitive Bias in two-party discussions and interviews containing longer speech segments, a proposed semi-automatic procedure involves “taking the temperature” of a transcribed dialog by measuring the number of detected points of possible tension and/or conflict between speakers-participants.

===Introduction===
Human-Computer Interaction (HCI) systems may assist in the evaluation of complex Human-Human interaction, as in the case of applications designed for journalists (Alexandris, Nottas and Cambourakis, 2015).

In spoken dialogs concerning complex interactions between speakers-participants, as in the case of spoken journalistic texts, there are aspects that can be evaluated by semi-automatic or interactive procedures aiming to by-pass Cognitive Bias, and there are aspects that can be evaluated by interactive procedures aiming to register Cognitive Bias.

Unlike task-specific dialogs (Tung et al., 2013) and typical collaborative dialogs (Yang, Levow and Meng, 2012, Wang et al., 2013), the Speech Acts performed by one or multiple speakers-participants may often involve complex Illocutionary Acts beyond the defined framework of the interaction. Specifically, the Illocutionary Act (Searle, 1969, Austin, 1962) performed by the Speaker concerned may not be restricted to “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview: Speakers-participants may have other or additional intentions regarding their presence and their role in the discussion or interview concerned.

In the spoken journalistic texts concerned, Illocutionary Acts not restricted to “Obtaining Information Asked” or “Providing Information Asked” are related to other or additional Speaker intentions. For example, a Speaker may focus on emphasizing an opinion (or the policy of the network concerned) or on (purposefully) creating tension in the interview or discussion. Furthermore, a consistent avoidance of the topics addressed may indicate that the Speaker is more interested in showing a mere presence in the discussion or interview than in sharing any information. The existence of additional, “hidden” Illocutionary Acts can be identified by procedures evaluating the behavior of speakers-participants in relation to specific values and benchmarks. The presentation and calculation of these values allows the possibility of by-passing or registering Cognitive Bias. The Cognitive Bias by-passed or registered concerns primarily the evaluation of Cognitive Bias of (i) the speakers-participants concerned but may also serve for the evaluation of the Confidence Bias of (ii) the user-evaluator of the recorded and transcribed discussion or interview.

===By-Passing and Registering Cognitive Bias in Multiple Speaker Discussions or in Short Speech Segments===
In smaller speech segments with constant and quick change of speaker turns and with a discourse structure compatible with models where each participant selects self (Wilson, 2005, Sacks, Schegloff and Jefferson, 1974), topic tracking (and topic change) allows the evaluation of speaker behavior and enables the identification of the speaker’s intentions and the Illocutionary Speech Acts performed (Searle, 1969, Austin, 1962). Topic tracking can be applied especially in short speech segments with two or multiple speakers-participants (Alexandris, 2018). The content of relatively short utterances can be summarized with the use of keywords chosen from each utterance by the user-evaluator (Alexandris, 2018), with the assistance of the Stanford POS Tagger for the automatic signalization of nouns in each turn taken by the Speakers in the respective segment in the dialog structure. The registered and tracked keywords, treated as local variables, signalize each topic and the relations between topics, since automatic Rhetorical Structure Theory (RST) analysis procedures (Stede, Taboada and Das, 2017, Zeldes, 2016) usually involve larger (written) texts and may not produce the required results.

The System generates a visual representation from the user’s interaction, tracking the corresponding selected topic-keywords in the dialog flow, as well as the chosen types of relations between them. The interactive generation of registered paths is similar to the paths with generated sequences of recognized keywords in spoken dialog systems, in the domains of consumer complaints and mobile phone services call centers (Nottas et al., 2007, Floros and Mourouzidis, 2016). This function is similar to user-independent evaluations of spoken dialog systems (Williams, Asadi and Zweig, 2017) for by-passing User bias (Nass and Brave, 2005, Cohen, 1997). Keywords (topics) may be repeated, or related to a more general concept (or global variable) (Lewis, 2009), or related to keywords (topics) concerning similar functions (corresponding to the Repetition, Generalization and Association relations respectively and the visual representations of Distances 1 (value “1”), 2 (value “2”) and 3 (value “3”) respectively) (Alexandris, 2018). A keyword involving a new command or function is registered as a new topic (New Topic, visual representation of Distance 4, corresponding to value “0”). The sequence of topics chosen by the user and the perceived relations between them generates a “path” of interaction, forming distinctive visual representations stored in a database currently under development: topics and words generating diverse reactions and choices from users result in different forms of generated visual representations for the same conversation and interaction (Alexandris, 2018).

The generated visual representations depict topics avoided, introduced or repeatedly referred to by each speaker-participant, and in specific types of cases may indicate the existence of additional, “hidden” Illocutionary Acts other than “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview. Thus, the evaluation of speaker-participant behavior aims to by-pass Cognitive Bias, specifically the Confidence Bias (Hilbert, 2012) of the user-evaluator, especially if multiple users-evaluators produce different forms of generated visual representations for the same conversation and interaction, which are compared to each other in the database. In this case, chosen relations between topics may describe Lexical Bias (Trofimova, 2014) and may differ according to political, socio-cultural and linguistic characteristics of the user-evaluator, especially if international users are concerned (Yu et al., 2010, Alexandris, 2010, Ma, 2010, Pan, 2000), due to lack of world knowledge of the language community involved (Paltridge, 2012, Hatim, 1997, Wardhaugh, 1992). The envisioned further development of the generated visual representations is their modeling in the form of graphs, similar to discourse trees (Marcu, 1999, Carlson, Marcu and Okurowski, 2001).

====Evaluation and Benchmarks====
The types of relations-distances between word-topics chosen by the user-evaluator are registered and counted. If (a) the number of “Repetitions”, (b) the number of “Generalizations” or (c) the number of “Topic Switches” is well over 50% of the registered relations-distances between word-topics, the interaction is signalized for further evaluation as containing Illocutionary Acts not restricted to “Obtaining Information Asked” or “Providing Information Asked”. The following benchmarks indicate interactions with Illocutionary Acts beyond the predefined framework of the dialog for multiple Speaker discussions and/or short speech segments, where Ds = Number of Distances and Sp = Number of Speaker turns:
* X = Ds ≤ Sp (calculating over 50% of “Repetitions” (Distance = 1, value “1”) or “Topic Switches” (Distance = 4, value “0”)).
* X = Ds > Sp × Gen (Gen = Sp × 3 ÷ 2) (calculating over 50% of “Generalizations” (Distance = 3, value “3”)).

These benchmarks for dialogs with short speech segments can be referred to as “(Topic) Relevance” benchmarks with a value of “X” or “Relevance (X)”.

===By-Passing and Registering Cognitive Bias in Two-Party Discussions and Interviews or in Long Speech Segments===
The further development of the database containing registered spoken interaction for determining and evaluating Cognitive Bias in spoken journalistic texts (Alexandris, 2018) involved the processing of discussions and interviews containing larger speech segments. As in the above-described multiple speaker discussions and short speech segments, the Illocutionary Act performed by the Speaker concerned may not be restricted to “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview.

In two-party discussions and interviews containing longer speech segments, the discourse structure is more compatible with turn-taking in “push-to-talk conversations”, with a strict protocol in managing the interview or discussion and turn-taking (Taboada, 2006). In this case, speakers-participants usually do not have the liberty of modifying or changing the topic, so that the strategy of topic tracking is insufficient for the identification of the speaker’s intentions. In larger speech segments, mostly occurring in interviews with a strict protocol and a set of predefined topics, automatic Rhetorical Structure Theory (RST) analysis procedures (Stede, Taboada and Das, 2017, Zeldes, 2016) can be performed on the transcribed text, on the condition that the speaker is allowed sufficient time to elaborate on the topic in question. The extent to which automatic RST analysis procedures can be executed on the transcribed text indicates the degree of collaborative interaction between the speakers-participants, especially on the part of the journalist-interviewer (referred to as Speaker 1), since the speaker-participant is allocated enough time to elaborate and/or argue on the topic concerned.

In the case of discussions and interviews containing larger speech segments, the identification of the speaker’s intentions and the detection of “hidden” Illocutionary Acts follows a process locating points of possible tension and/or conflict between speakers-participants. At these points, Cognitive Bias can be by-passed or registered. Cognitive Bias is by-passed by signalizing and counting the points of possible tension and/or conflict between speakers-participants, henceforth referred to as “hot spots”. The signalization of “hot spots” is based on the violation of the Quantity, Quality and Manner Maxims of the Gricean Cooperative Principle (Grice, 1975). Cognitive Bias is registered by comparing the content of the Speaker turns in the signalized “hot spots” and assigning a respective value.

====By-Passing Cognitive Bias: Automatic Signalization of “Hot Spots” and the Gricean Cooperative Principle====
To by-pass Cognitive Bias in two-party discussions and interviews containing longer speech segments, a proposed semi-automatic procedure involves “taking the temperature” of a transcribed dialog by measuring the number of detected points of possible tension and/or conflict between speakers-participants. These points, henceforth referred to as “hot spots”, concern speech segments where there is a recognition of speaker turns, namely a switch between Speaker 1 and Speaker 2 by the Speech recognition module of the transcription tool. The signalization of multiple “hot spots” indicates a more argumentative than collaborative interaction, even if speakers-participants display a calm and composed behavior. In particular, the Illocutionary Act performed by the Speaker concerned may not be restricted to “Obtaining Information Asked” or “Providing Information Asked” in a discussion or interview.

A “hot spot” consists of the pair of utterances of both speakers, namely a question-answer pair or a statement-response pair or any other type of relation between speaker turns. In longer utterances, the first 60 words of the second speaker’s (Speaker 2) utterance are processed (approximately 1-3 sentences, depending on length, with an average sentence length of 15-20 words (Cutts, 2013)) and the last 60 words of the first speaker’s (Speaker 1) utterance are processed (approximately 1-3 sentences, depending on length). The automatically signalized “hot spots” are extracted to a separate template for further processing. The extraction contains not only the detected segments but also the complete utterances consisting of both speaker turns of Speaker 1 and Speaker 2.

For a segment of speaker turns to be automatically identified as a “hot spot”, at least two of the following three conditions (1), (2) and (3) must apply to one or to both of the speakers’ utterances, of which conditions (1) and (2) are directly or indirectly related to the flouting of Maxims of the Gricean Cooperative Principle (Grice, 1975). These conditions are the following:
* (1) Additional, modifying features: In one or in both speakers’ utterances in the segment of speaker turns there is at least one phrase containing (a) a sequence of two adjectives (ADJ ADJ), or (b) an adverb and an adjective (or more adjectives) (ADV ADJ), or (c) two adverbs (ADV ADV). These forms of adjectival or adverbial phrases are detectable with a POS Tagger (for example, the Stanford POS Tagger).
* (2) Reference to the interaction itself and to its participants with negation: In one or in both speakers’ utterances, (a) the subject of the sentence containing the negation is “I” or “you” ((I/You) “don’t”, “do not”, “cannot”) and (b) in the verb phrase (VP) there is at least one speech-related or behavior verb-stem referring to the dialog itself (for example, “speak”, “listen”, “guess”, “understand”). This applies to parts of speech other than verbs (for example, “guessing”, “listener”) as well as to words constituting parts of expressions related to speech or behavior (“conclusions”, “words”, “mouth”, “polite”, “nonsense”, “manners”). The different forms of negation are detectable with a POS Tagger. The respective words and word categories may constitute a small set of entries in a specially created lexicon or may be retrieved from existing databases or WordNets.
* (3) Prosodic emphasis and/or Exclamations: (a) Exclamations include expressions such as “Look”, “Wait” and “Stop”. As in the above-described case (2), the respective words and word categories may constitute a small set of entries in a specially created lexicon or may be retrieved from existing databases or WordNets. (b) Prosodic emphasis, detected in the speech processing module, may occur in one or more of the above-described words of categories 1a, 1b, 1c, 2a and 2b, or in the noun or verb following (modified by) 1a, 1b and 1c.

In the case of 1a, 1b and 1c, there is extra information added to the basic content of the utterance constituting the necessary information required to fulfil the Gricean Cooperative Principle with respect to the Maxim of Quantity (“Do not make your contribution more informative than is required”). Here, the Speaker violates the Maxim of Quantity in the Gricean Cooperative Principle. In the case of 2a and 2b, the Speaker perceives a violation of the Gricean Cooperative Principle by the previous Speaker. In particular, the content of the speaker’s utterance is not limited to the current topic in question but refers to the dialog itself, mostly functioning as a comment. Specifically, 2a and 2b imply a violation of the Gricean Cooperative Principle with respect to the Maxim of Quality (“1. Do not say what you believe to be false”, “2. Do not say that for which you lack adequate evidence”) (Grice, 1975) and/or with respect to the Maxim of Manner (Submaxim 2, “Avoid ambiguity”) (Grice, 1975) in the utterance of the previous Speaker. In other words, in 2a and 2b, the Speaker considers the content of the previous Speaker’s utterance to be unacceptable, ambiguous, false or controversial.

The number of automatically signalized “hot spots” indicates the degree to which discussions and interviews containing larger speech segments constitute a dialog with many points of tension and/or conflict. The average duration of discussions and interviews containing larger speech segments in the Media is 30 to 45 minutes. A typical example of a dialog with many detected points of possible tension and/or conflict between speakers-participants is an approximately 32-minute-long interview with seven (7) registered “hot spots” (BBC (British Broadcasting Corporation): HARDtalk interview by journalist Stephen Sackur on 16th of April 2018). One or both speakers’ utterances may display two or more of features (1), (2) and (3).

====Evaluation and Benchmarks====
A remarkable degree of tension in a discussion is signalized by multiple detected “hot spots” and not by sporadic occurrences of “hot spots”. Thus, 1-2 “hot spot” occurrences in the longer speech segments in question (30-45 mins) signalize a low degree of tension. A remarkable degree of tension in a 30-45 minute discussion or interview is related to at least 4 detected “hot spots” (where 3 “hot spots” constitute a marginal value). Considering the above, the benchmark for evaluating a remarkable degree of tension concerns the calculation of the duration of the discussion or interview in the Media (for example, 35 mins) and the number of signalized “hot spots” (SPEECH SEGMENT-count) in Speaker turns. The defined benchmark for evaluating Speaker behavior is the number of minutes divided by the number of identified speech segments signalized as “hot spots”, which should yield a single-digit number if the above-described minimal number of at least 4 detected “hot spots” is reached. For example, the acceptable values are “8.75”, “7” or, ideally, “5” (for a file of 35 minutes) versus “17.5” or “11.6” (for a file of 35 minutes). Interactions with Illocutionary Acts beyond the predefined framework of the dialog-discussion (with two speakers-participants and long speech segments) are based on the detected points of possible tension and/or conflict indicated by the following benchmark, where Y = wav file length in minutes divided by (÷) the number of “hot spot” signalized speech segments:
* Y < 10.
* Example: File length = 35 mins, SPEECH SEGMENT-count: 5, Evaluation: 7.

These benchmarks for dialogs with long speech segments can be referred to as “Tension” benchmarks with a value of “Y” or “Tension (Y)”.

====Registering Cognitive Bias: Interactive Comparison of Speaker Turns====
The registration of Cognitive Bias concerns the comparison of the actual content of the pair of utterances of Speaker 1 and Speaker 2 in the signalized “hot spots”. As stated above, the automatically signalized “hot spots” are extracted to a separate template for interactive processing, where the “hot spot” utterances of both speakers are compared. If the last 60 words (approximately 1-3 sentences, with an average sentence length of 15-20 words (Cutts, 2013)) of the first speaker’s utterance contain at least two of the above-described features (1), (2) and (3), the Quantity and Manner Maxims of the Gricean Cooperative Principle (Grice, 1975) are violated. Specifically, Cognitive Bias is registered by comparing the content of the Speaker turns in the signalized “hot spots” and assigning the following respective values:
* (a) Each “hot spot” is marked with a (1,1) if both speakers’ utterances are considered equally non-collaborative.
* (b) If this is the case for one of the two speakers, in particular Speaker 1, the “hot spot” is marked with a (1,0) for Speaker 1 (in this case, the journalist-reporter). In this case, the style of the question or statement uttered is not considered acceptable (it contains features violating the Gricean Cooperative Principle) with respect to the Maxim of Manner or the Maxim of Quantity (Irony), or with respect to the Maxim of Quality (content is considered false (“F”)).
* (c) If this is the case for Speaker 2, the “hot spot” is marked with a (0,1), if the interviewee’s (Speaker 2) reaction is not justified with respect to the style and content of the utterance of Speaker 1.
* (d) If a “hot spot” speech segment is evaluated by the User not to be a point of possible tension and/or conflict between speakers-participants, the false “hot spot” is marked with a (0,0) for both Speakers.

====Evaluation and Benchmarks====
Both Speakers may have an equal number of gradings of “1” in all extracted “hot spots”, or one of the Speakers may have a slightly higher/lower or a considerably higher/lower number of gradings of “1”. A grading of “1” in 50% or more of the “hot spots” signalizes that the Illocutionary Act performed by the Speaker concerned is not restricted to “Obtaining Information Asked” or “Providing Information Asked”. Speaker behavior indicating that the Illocutionary Acts performed are not restricted to the predefined interaction framework is evaluated by the following benchmarks, where Z = the number of “hot spot” signalized speech segments divided by 2 (50%):
* Sum of Speaker grades ≥ Z.
* Example: Evaluation of Speaker Behavior (Speaker 1 is less collaborative than Speaker 2).
** SPEAKER1: (1), (1), (1), (0), (1).
** SPEAKER2: (0), (0), (1), (1), (0).
** File length: 35 mins, SPEECH SEGMENT-count “hot spots”: 5 (sum of grades = 6, 6 ≥ Z where Z = 2.5).

These benchmarks for dialogs with long speech segments can be referred to as “Collaboration” benchmarks with a value of “Z” or “Collaboration (Z)”.

===Conclusions and Further Research===
By-passing and registering Cognitive Bias in HCI systems assisting in the evaluation of Human-Human interaction involve both automatic and interactive procedures. Interactive topic tracking in the dialog structure and automatic “hot spot” generation involving points of tension and/or conflict contribute to an evaluation of the behavior and intentions of speakers-participants during the interaction.

The behavior and Cognitive Bias of (i) the speakers-participants are evaluated in relation to the values of the “Relevance (X)”, “Tension (Y)” and “Collaboration (Z)” benchmarks. However, the same benchmarks may be used for evaluating the Cognitive Bias, specifically the Confidence Bias, of (ii) the user-evaluator of the recorded and transcribed discussion or interview.

Spoken dialogs concerning complex interactions between speakers-participants are not limited to spoken journalistic texts. As the variety and complexity of spoken HCI applications increase, Speech Acts performed by one or multiple users-participants, even by the System itself, may often involve Illocutionary Acts beyond the predefined framework of a task-oriented dialog, especially in systems with emotion recognition, virtual negotiation, psychological support or decision-making.

===References===
* Alexandris, C. 2018. Measuring Cognitive Bias in Spoken Interaction and Conversation: Generating Visual Representations. In: Beyond Machine Intelligence: Understanding Cognitive Bias and Humanity for Well-Being AI. In Proceedings from the AAAI Spring Symposium, Stanford University, 204-206, Technical Report SS-18-03, Palo Alto, CA: AAAI Press.
* Alexandris, C.; Nottas, M.; and Cambourakis, G. 2015. Interactive Evaluation of Pragmatic Features in Spoken Journalistic Texts. In Kurosu, M. ed., Human-Computer Interaction, Users and Contexts, LNCS Lecture Notes in Computer Science, Vol. 9171: 259-268, Heidelberg, Germany: Springer.
* Alexandris, C. 2010. English, German and the International “Semi-professional” Translator: A Morphological Approach to Implied Connotative Features. Journal of Language and Translation, September 2010, Vol. 11, 2, Sejong University, Korea: 7-46.
* Austin, J. L. 1962. How to Do Things with Words. Urmson, J.O. and Sbisà, M. eds., 2nd edition, 1976, Oxford, UK: Oxford University Press.
* Carlson, L.; Marcu, D.; and Okurowski, M. E. 2001. Building a Discourse-Tagged Corpus in the Framework of Rhetorical Structure Theory. In Proceedings of the 2nd SIGDIAL Workshop on Discourse and Dialogue, Eurospeech 2001, Denmark, September 2001, ACL Anthology, W01-16.
* Cohen, P.; Johnston, M.; McGee, D.; Oviatt, S.; Pittman, J.; Smith, I.; Chen, L.; and Clow, J. 1997. Quickset: Multimodal Interaction for Distributed Applications. In Proceedings of the 5th ACM International Multimedia Conference, 31-40, New York, NY: ACM Digital Library.
* Cutts, M. 2013. Oxford Guide to Plain English. 4th edition, Oxford, UK: Oxford University Press.
* Du, J.; Alexandris, C.; Mourouzidis, D.; Floros, V.; and Iliakis, A. 2017. Controlling Interaction in Multilingual Conversation Revisited: A Perspective for Services and Interviews in Mandarin Chinese. In Kurosu, M. ed., Lecture Notes in Computer Science LNCS 10271: 573–583, Heidelberg, Germany: Springer.
* Floros, V. and Mourouzidis, D. 2016. Multiple Task Management in a Dialog System for Call Centers. Master’s Thesis, Department of Informatics and Telecommunications, National University of Athens, Greece.
* Grice, H.P. 1975. Logic and conversation. In: Cole, P., Morgan, J. eds., Syntax and Semantics, Vol. 3. Academic Press, New York.
* Hatim, B. 1997. Communication Across Cultures: Translation Theory and Contrastive Text Linguistics. Exeter, UK: University of Exeter Press.
* Hilbert, M. 2012. Toward a Synthesis of Cognitive Biases: How Noisy Information Processing Can Bias Human Decision Making. Psychological Bulletin, Vol 138(2), Mar 2012: 211-237.
* Lewis, J.R. 2009. Introduction to Practical Speech User Interface Design for Interactive Voice Response Applications, IBM Software Group, USA, Tutorial T09 presented at HCI 2009, San Diego, CA, USA.
* Ma, J. 2010. A comparative analysis of the ambiguity resolution of two English-Chinese MT approaches: RBMT and SMT. Dalian University of Technology Journal, 31(3): 114-119.
* Marcu, D. 1999. Discourse trees are good indicators of importance in text. In Mani, I. and Maybury, M. (eds), Advances in Automatic Text Summarization, Cambridge MA, The MIT Press: 123-136.
* Nass, C. and Brave, S. 2005. Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship. Cambridge MA: The MIT Press.
* Nottas, M.; Alexandris, C.; Tsopanoglou, A.; and Bakamidis, S. 2007. A Hybrid Approach to Dialog Input in the CitzenShield Dialog System for Consumer Complaints. In Proceedings of HCII 2007, Beijing, Peoples Republic of China.
* Paltridge, B. 2012. Discourse Analysis: An Introduction. London, UK: Bloomsbury Publishing.
* Pan, Y. 2000. Politeness in Chinese Face-to-Face Interaction. Advances in Discourse Processes Series Vol. 67, Stamford, CT: Ablex Publishing Corporation.
* Sacks, H.; Schegloff, E. A.; and Jefferson, G. 1974. A simplest systematics for the organization of turn-taking for conversation. Language, Vol. 50: 696-735.
* Searle, J. R. 1969. Speech Acts: An Essay in the Philosophy of Language. Cambridge, MA: Cambridge University Press.
* Taboada, M. 2006. Spontaneous and non-spontaneous turn-taking. Pragmatics, Vol. 16 (2-3): 329-360.
* Stede, M.; Taboada, M.; and Das, D. 2017. Annotation Guidelines for Rhetorical Structure. Manuscript. University of Potsdam and Simon Fraser University. March 2017.
* Trofimova, I. 2014. Observer Bias: An Interaction of Temperament Traits with Biases in the Semantic Perception of Lexical Material. PLoS ONE 9(1): e85677.
* Tung, T.; Gomez, R.; Kawahara, T.; and Matsuyama, T. 2013. Multi-party Human-Machine Interaction Using a Smart Multimodal Digital Signage. In Kurosu, M. ed., Human-Computer Interaction. Interaction Modalities and Techniques, Lecture Notes in Computer Science, Vol. 8007, 2013, 408-415, Heidelberg, Germany: Springer.
* Wang, H.; Gailliot, A.; Hyden, D.; and Lietzenmayer, R. 2013. A Knowledge Elicitation Study for Collaborative Dialogue Strategies Used to Handle Uncertainties in Speech Communication While Using GIS. In Kurosu, M. ed., Human-Computer Interaction, Lecture Notes in Computer Science 8007, 135-146, Heidelberg, Germany: Springer.
* Wardhaugh, R. 1992. An Introduction to Sociolinguistics. 2nd edition. Oxford, UK: Blackwell.
* Williams, J.D.; Asadi, K.; and Zweig, G. 2017. Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, July 30 - August 4, 2017, 665–677, Association for Computational Linguistics (ACL).
* Wilson, K. E. 2005. An oscillator model of the timing of turn-taking. Psychonomic Bulletin and Review 2005: 12 (6): 957-968.
* Yang, Z.; Levow, G.A.; and Meng, F. H. 2012. Predicting User Satisfaction in Spoken Dialog System Evaluation With Collaborative Filtering. IEEE Journal of Selected Topics in Signal Processing, Vol. 6, Issue 8, Dec. 2012: 971-981.
* Yu, Z.; Yu, Z.; Aoyama, H.; Ozeki, M.; and Nakamura, Y. 2010. Capture, Recognition, and Visualization of Human Semantic Interactions in Meetings. In Proceedings of PerCom, Mannheim, Germany, 2010.
* Zeldes, A. 2016. "rstWeb - A Browser-based Annotation Interface for Rhetorical Structure Theory and Discourse Relations". In Proceedings of NAACL-HLT 2016 System Demonstrations, San Diego, CA, 1-5, ACL Anthology, NAACL-HLT 2016.