=Paper=
{{Paper
|id=Vol-1419/paper0110
|storemode=property
|title=Multimodal Discourse: In Search of Units
|pdfUrl=https://ceur-ws.org/Vol-1419/paper0110.pdf
|volume=Vol-1419
|dblpUrl=https://dblp.org/rec/conf/eapcogsci/KibrikFN15
}}
==Multimodal Discourse: In Search of Units==
Andrej A. Kibrik (aakibrik@gmail.com), Institute of Linguistics RAS and Lomonosov Moscow State University, B. Kislovskij per. 1, Moscow, 125009, Russia

Olga V. Fedorova (olga.fedorova@msu.ru), Lomonosov Moscow State University, Russian Academy of National Economy and Public Administration, and Institute of Linguistics RAS, Leninskie Gory 1, Moscow, 119899, Russia

Julia V. Nikolaeva (julianikk@gmail.com), Lomonosov Moscow State University and Institute of Linguistics RAS, Leninskie Gory 1, Moscow, 119899, Russia

===Abstract===
Human communication is inherently multimodal. In this study we focus on three channels of spoken discourse: the verbal component, prosody, and gesticulation. We address the question of units that can be identified within these components and in spoken multimodal discourse as a whole. The basic unit of the verbal channel is the clause, reporting an event or a state. A set of prosodic criteria helps define elementary discourse units, that is, prosodic units serving as quanta of discourse production. The gestural channel consists of individual gestures, each defined by a set of features. Elementary discourse units are strongly coordinated with both clauses and gestures and can thus be considered basic units of multimodal discourse. Larger units can also be identified, such as prosodic sentences and series of gestures that again demonstrate coordination. By identifying units of natural discourse, coordinated across various channels, we make a step towards multimodal linguistics.

Keywords: discourse structure; multimodal discourse; clause; prosody; gesture; elementary discourse unit; sentence.

===1. Introduction===
In modern linguistics, as well as in other domains of cognitive science, there is a growing understanding that human communication is inherently multimodal. When we communicate orally, we not only produce chains of words, but also intonate, gesticulate, interact with eye gaze, etc. (Gibbon et al. eds., 2000; Kress, 2002; Hugot, 2007; So et al., 2009; Loehr, 2012; Ford, Fox, & Thompson, 2013; Goldin-Meadow, 2014, inter alia). A research program of multimodal linguistics is gradually evolving (Kibrik, 2010; Kress, 2010; Knight, 2011; Adolphs & Carter, 2013; Müller et al. eds., 2014) that treats the verbal structure on a par with non-verbal devices. Among non-verbal devices, sometimes only kinetic-visual behaviors are considered. But we find it very important to identify prosody (see e.g. Kodzasov, 2009), that is, non-segmental aspects of the vocal signal, as a distinct communication channel.

Kibrik and Molchanova (2013) considered three communication channels employed in multimodal discourse: the verbal component, prosody, and kinetic-visual behavior. They found that all three channels play an important (and comparable) role in the overall process of conveying a message from a speaker to an addressee.

In this study we focus on three components of spoken discourse: the verbal component, prosody, and gesticulation. These components can be viewed separately to an extent but they are all interwoven in natural communication. Like any human behavior, multimodal discourse has structure. If so, what are its units? We discuss the basic units found within the three channels considered separately (sections 2–4), and proceed with suggestions on coordinated basic units of multimodal discourse (section 5). In section 6 we discuss larger, more complex units of spoken discourse, and offer conclusions in section 7. This study is based on a corpus of Russian discourse, but some English examples are cited below for ease of exposition.

===2. The verbal channel===
The verbal component of discourse largely consists of reporting events and states (Chafe, 1994). Languages have developed a universal syntactic structure for packaging events and states: the clause. Each clause reports an event or a state, along with their participants, or referents. For example, the minimal narrative Veni, vidi, vici consists of three events, each reported with a clause consisting of a single word: a verbal predicate, encoding in its inflection the subject participant. Consider a natural spoken example (from text SBC032 of the Santa Barbara corpus of spoken American English, see www.linguistics.ucsb.edu/research/santa-barbara-corpus), consisting of two clauses, each reporting an event:

And then I was forced out,

because I failed a promotion to commander!

Clauses may report events of various complexity and with varying amounts of detail, and they may include additional elements, especially connectors indicating the semantic relationships between clauses, such as and then or because in the example above. In various theories of discourse structure (e.g. Mann & Thompson, 1988; Carlson, Marcu & Okurowski, 2003; Wolf & Gibson, 2005) clauses are organized in a hierarchical network of nodes connected with discourse-semantic relations. Groups of clauses are often organized into syntactic units known as sentences, with the links between clauses being tight to various degrees, see e.g. Givón, 2009; Laury & Ono, 2014.

===3. The prosodic channel===
Prosody directly encodes the dynamics of how thought unfolds during discourse production. There is a set of prosodic phenomena, including pausing, intonation contours, tempo patterns, loudness patterns, and accent placement, that converge in a unit of speech variously dubbed syntagm (Shcherba, 1955), intonation unit (Chafe, 1994), prosodic unit (e.g. Genetti & Slater, 2004), etc. We prefer the term elementary discourse unit (EDU), see Kibrik & Podlesskaya eds., 2009; Kibrik, 2011. EDUs are building blocks, or quanta, of spoken discourse. They are coordinated with breathing: one EDU is normally produced during an exhalation, and boundary pauses coincide with an inhalation. EDUs are linguistic representations of successive cognitive states, termed foci of consciousness in Chafe, 1994. EDU identification in speech is a procedure based on expert assessment. Well-trained transcribers of spoken discourse strongly agree on EDU segmentation.

A remarkable fact about EDUs is their significant correlation with clauses. In a number of studies of various languages (Chafe, 1994 for English; Matsumoto, 2003 for Japanese; Genetti & Slater, 2004 for Newari; Wouk, 2008 for Sasak; Kibrik & Podlesskaya eds., 2009 for Russian, inter alia) the share of EDUs coinciding with clauses was found to vary between 50% and 70%. In the following example (from the same text; see spokencorpora.ru/showtranshelp.py for transcription conventions) lines #12 and #14 are clausal EDUs, while line #13 is a parcellated adjunct semantically belonging to the preceding clause but expressed with a subclausal EDU:

00:22.9 12 ····(1.0) /My friend stood up /behind his \desk,

00:26.0 13 ··(0.2) in his /\fu-ull \f-four \–stripes,

00:28.0 14 and \said:

Properties of EDUs have clear parallels in goal-directed behavior of non-human mammals. The exploratory movement of rodents in a new environment is organized in quanta (runs); runs are identified through initial acceleration and final deceleration, they are targeted at an informationally rich goal (analog of primary accent in discourse segments), they are separated by periods of freezing, etc. (see e.g. Kafkafi et al., 2001; Cherepov & Anokhin, 2008). These similarities suggest that the quantized structure of discourse and its specific prosodic aspects have deep behavioral, neurocognitive, and evolutionary roots.

===4. The gestural channel===
In human kinetic-visual behavior, manual gesticulation plays a particularly important role. There are two widely accepted polar kinds of manual gestures. First, “emblems” (Efron, 1941/1972; Ekman & Friesen, 1969), also named “autonomous” (Kendon, 1983), or “quotable” gestures (Kendon, 1986), are manual signs with fixed form and relatively fixed meaning, widely shared by a given linguistic community. Second, “illustrative” or “spontaneous” (McNeill, 1992) gesticulation, also called “co-speech” or “speech-associated” gestures (for an overview, see Kendon, 2004), consists of less conventional and more context-sensitive gestures. Illustrative gestures are far more common in natural discourse (Nikolaeva, 2013). It is well established that illustrative gestures substantially participate in conveying a message from the speaker to the addressee (Cassell et al., 1999; Melinger & Levelt, 2004; Hostetter, 2011; Hall & Knapp eds., 2013). We posit the following major kinds of illustrative gestures: depictive (“iconic” + “metaphoric” in McNeill, 1992; “descriptive” in Kendon, 2004), metadiscursive (“pragmatic” in Payrató & Tessendorf, 2014), pointing (“deictic” in McNeill, 1992), and beats (“batons” in Efron, 1941/1972).

This study is primarily limited to depictive gestures, because they are particularly frequent in our corpus (59%) and contribute semantically (either in a redundant or in a complementary fashion) to the propositional content conveyed in the corresponding verbal component. Depictive gestures represent objects or act out events/states. Consider the two initial EDUs from ex. 3 in the Appendix. EDU #17 tam derevo ‘there is a tree’ is accompanied by the following depictive gesture: the right hand palm faces down, fingers are half curled and widely spaced, the right hand moves up in front of the speaker’s face, the left hand palm faces up at the chest level, with fingers half curled. EDU #18 k derevu prižata lestnica ‘to the tree a ladder is pressed’ is accompanied by two identical depictive gestures, the first of which cooccurs with the initial pause, and the second with the word lestnica ‘ladder’: the right hand faces the listener, fingers half curled, moves along a slanted line from the center right and down, the left hand remains at the chest level, faces up, with half curled fingers. Our dataset also includes metadiscursive gestures (see ex. 4) that demonstrate more recurrent properties compared to the depictive gestures, but still are a lot more variable than the emblems.

We use the term gesture to refer to the basic unit of co-speech gesticulation. A gesture is a communicatively significant manual movement, characterized by a unified pattern that includes trajectory, handshape and position, as well as other features. According to Kendon (1980, 2004) and McNeill (1992), the gestural structure includes units, phrases, and phases. The gesture unit (G-unit) “begins the moment the limb begins to move and ends when it has reached a rest position again” (McNeill, 1992: 83). A G-phrase consists of the following phases: a non-obligatory preparation, a non-obligatory pre-stroke hold, an obligatory stroke, a non-obligatory post-stroke hold, while a retraction (or recovery) is a part of the G-unit (Kendon, 1980, 2004). There can be one or more G-phrases in a G-unit. Our understanding of “gesture” is close to G-phrase, but unlike the latter a gesture may include (though not obligatorily) a retraction phase. In other words, a gesture ends either when the rest position is resumed or when another gesture begins.

===5. Coordination of basic units===
A key issue in the research program of multimodal linguistics is the question of coordination between the verbal, prosodic, and gestural channels. If we see discourse as a fundamentally multimodal process, we need to identify a unified basic unit of this process. A possible approach is to select one of the already established units as the basic one. As has been shown in section 3, the EDU is a good candidate for this role, particularly because of its close connection with the quanta of non-linguistic behavior. Also note that prosody, serving as the source of criteria for EDU identification, is the ontogenetically earliest communication channel (see e.g. Crystal 1979, Blake 2000), preceding not only segmental speech but also gesticulation. We already know that EDUs strongly correlate with clauses. How do EDUs relate to gestures?

We explored this question on the basis of 14 Russian retellings of the Pear Film (Chafe, 1980), videorecorded and transcribed. Transcription, including temporal dynamics, pausing, annotation of EDUs, and other prosodic phenomena, was done with the help of the PRAAT program (www.fon.hum.uva.nl/praat). Gesture annotation was done in the ELAN program (www.lat-mpi.eu/tools/elan). A requirement observed in this work was independent annotation of clauses, EDUs, and gestures. The corpus consists of 37 minutes of videorecording, 1232 EDUs, and 705 gestures (414 of which are depictive).

We found that a prototypical EDU cooccurs with one depictive gesture, while about 20% of EDUs cooccur with more than one gesture, see ex. 1: 9; ex. 3: 18, 19 in the Appendix. (Here and below, the number after the colon refers to the EDU number within the given example. Examples are provided in the Appendix.) This recalls the well-known generalization: “A general rule is one gesture, one clause [...] some clauses have more than one gesture and some gestures cover more than one clause” (McNeill, 1992: 94). Typically (approx. 90%), a depictive gesture falls within the temporal bounds of a single EDU. We also found that depictive gestures often (approx. 60%) cooccur with a whole EDU (ex. 3: 20, 21). When a gesture is shorter than the corresponding EDU, it is often temporally coordinated with the later part of the EDU, which is the typical locus of rhematic information (ex. 3: 17, 18). We can thus specify McNeill’s claim, positing not just the relatedness of gestures to the vocal part of a message, but also a high degree of temporal coordination between gestures and EDUs.

===6. Coordination of larger units===
While the EDU is the basic unit of talk, there are higher-order units, too. In particular, in various languages spoken correlates of the written sentence have been found (Chafe, 1994; Genetti & Slater, 2004; Kibrik, 2008, 2011). The spoken sentence is established on the basis of prosodic criteria, such as target tone level (so-called period intonation), and functions as a structural unit larger than an EDU but shorter than an episode. Cognitively, in Chafe’s (1994: 148) terms, a sentence is the verbalization of a “superfocus of consciousness”. Is there a correlate of the prosodic sentence in the gestural channel?

By default, co-speech gestures are independent of each other. However, McNeill et al. (2001) discovered what can be called gesture assimilation. Some gestures are organized in series with repeated properties. McNeill et al. (2001) differentiate between the following two phenomena:
* in so-called catchments, formal properties of gestures (such as location in space, handshape and trajectory, etc.) may be repeated from one gesture to another, formal similarity conveying certain repeated semantic features;
* in gesture inertia, formal properties are shared in a series of gestures, but no semantic relatedness may be observed.

Fig. 1 illustrates four gestures, two of which accompany EDU #9 and two accompany EDUs #10–11 in example 1. These gestures depict:
* Fig. 1a: the abundance of pears;
* Fig. 1b: self-directed movement, putting pears into the apron;
* Fig. 1c: downward movement with the pears;
* Fig. 1d: outward movement of the pears, corresponding to the verb vykladyval ‘was taking out’.

The uniform hand configuration with the slightly curled fingers depicts pears in the gardener’s hands (Nikolaeva, 2013). This is an instance of catchment.

Figure 1 (panels a–d). Catchment.

Catchments as series of gestures are possible candidates for gestural correlates of prosodic sentences. Our data include about 150 instances of catchments. They split into two groups of equal size. In the first group, each gesture falls within the bounds of the corresponding EDU, and the boundaries of the gesture series coincide with the boundaries of the prosodic sentence, cf. ex. 3. These kinds of instances apparently support the coordination between the prosodic and gesture units. In the second group, a gesture series is coordinated with a certain part of a prosodic sentence (ex. 1; ex. 4). A closer look at the second kind of instances shows that they mark the most informationally rich parts of sentences (ex. 4: 75, 76), whereas some other EDUs of the sentence are accompanied by independent gestures; ex. 4: 71 demonstrates two metadiscursive gestures “palm up, open hand” illustrating the process of information transfer (conduit metaphor).

Overall, catchments are coordinated with prosodic sentences. Given that catchments are a special case of G-units (see section 4 above), we hypothesize that coordination with prosodic sentences can be extended to G-units in general. This latter point requires further investigation.

Turning to gesture inertia, consider Fig. 2, which illustrates three gestures accompanying the three EDUs in example 2. These gestures depict:
* Fig. 2a: the sudden halt;
* Fig. 2b: the falling bicycle;
* Fig. 2c: the falling hat (a gesture similar in configuration and trajectory to the previous one but with a larger amplitude).

In this case gesture assimilation is only formal, in contrast to catchments, in which similar gestures contain shared semantic features.

Figure 2 (panels a–c). Gesture inertia.

In a first approximation, infrequent instances of gesture inertia appear to be coordinated with the unit of discourse known as episode (van Dijk, 1981). We are not aware of robust methods of episode identification, either semantic or prosodic, so we have identified episodes intuitively. Example 2 illustrates a typical situation, in which gesture inertia is a series of gestures bridging a sentence boundary and joining a group of EDUs that qualifies as a small episode.

===7. Conclusion===
We have found that the basic units of the three channels of multimodal discourse (verbal, prosodic, and gestural) are coordinated with one another. More specifically, the prosodically identified elementary discourse unit can be shown to be coordinated with the verbal channel and with the gestural channel. We have chosen the prosodic unit as the central one because it is established on the basis of general behavioral criteria. Unlike gesture, prosody is always present in talk. In the studies reported in Kibrik & Molchanova, 2013 it proved difficult to isolate the verbal channel, as talking inevitably involves prosody.

Apart from basic units, we have also discussed larger units of spoken discourse. It appears that prosodically identified sentences and episodes are coordinated with gesture series known as catchment and inertia.

Even though we are looking for structure and units in discourse, those should not be understood in the sense of absolute discreteness. Units, or quanta, do exist, but the boundaries between them are typically less than discrete. There are many instances of outliers and hybrids that complicate crisp and neat unit boundaries. As is shown by Kibrik (2015), this property of discourse structure is shared with other levels of language, as well as with cognition in general. Non-discrete effects abound both between syntagmatic units and between paradigmatic types. This resonates with McNeill’s (2005) suggestion that gestures may be classified into dimensions rather than discrete categories, and a given gesture may, for instance, combine features of a depictive and a pointing gesture.

===Acknowledgment===
This study is supported by the Russian Science Foundation (grant #14-18-03819).

===References===
* Adolphs, S., & Carter, R. (2013). Spoken corpus linguistics: From monomodal to multimodal. N.-Y.: Routledge.
* Blake, J. (2000). Routes to Child Language: Evolutionary and Developmental Precursors. Cambridge: CUP.
* Carlson, L., Marcu, D., & Okurowski, M. E. (2003). Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory. In J. van Kuppevelt & R. Smith (Eds.), Current and new directions in discourse and dialogue. Dordrecht: Kluwer.
* Cassell, J., McNeill, D., & McCullough, K. E. (1999). Speech-gesture mismatches: Evidence for one underlying representation of linguistic and non-linguistic information. Pragmatics and Cognition, 7(1), 1–33.
* Chafe, W. (1994). Discourse, consciousness, and time. The flow and displacement of conscious experience in speaking and writing. Chicago: University of Chicago Press.
* Chafe, W. (Ed.) (1980). The pear stories: Cognitive, cultural, and linguistic aspects of narrative production. Norwood, N.J.: Ablex.
* Cherepov, A., & Anokhin, K. (2008). Development of automatic analysis and recognition of mouse behavior by segmentation and t-pattern method using video tracking. Proceedings of Measuring Behavior 2008 (pp. 253–254). Maastricht, The Netherlands, August 26–29, 2008.
* Crystal, D. (1979). Prosodic development. In P. J. Fletcher & M. A. Garman (Eds.), Language acquisition (pp. 33–48). Cambridge: CUP. (2nd edn., 1986, pp. 174–97.)
* Efron, D. (1941/1972). Gestures, race and culture. The Hague: Mouton.
* Ekman, P., & Friesen, W. V. (1969). The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Semiotica, 1, 49–98.
* Ford, C. E., Thompson, S. A., & Drake, V. (2012). Bodily-visual practices and turn continuation. Discourse Processes, 49(3-4), 192–212.
* Ford, C. E., Fox, B., & Thompson, S. A. (2013). Units and/or action trajectories? The language of grammatical categories and the language of social action. In B. Szczepek Reed & G. Raymond (Eds.), Units of talk – Units of action. Amsterdam: Benjamins.
* Genetti, C., & Slater, K. (2004). An analysis of syntax and prosody interactions in a Dolakhā Newar: Rendition of the Mahābhārata (with appendices and sound files). Himalayan Linguistics, 3, 1–91.
* Gibbon, D., Mertins, I., & Moore, R. K. (Eds.) (2000). Handbook of multimodal and spoken dialogue systems: Resources, terminology and product evaluation. Berlin: Springer.
* Givón, T. (2009). Multiple routes to clause union: The diachrony of complex verb phrases. In T. Givón & M. Shibatani (Eds.), Syntactic complexity: Diachrony, acquisition, neuro-cognition, evolution. Amsterdam: Benjamins.
* Goldin-Meadow, S. (2014). Widening the lens: What the manual modality reveals about language, learning, and cognition. Philosophical Transactions of the Royal Society, 369.
* Hall, J. A., & Knapp, M. L. (Eds.) (2013). Handbooks of communication science: Nonverbal communication. Berlin: De Gruyter Mouton.
* Hostetter, A. B. (2011). When do gestures communicate? A meta-analysis. Psychological Bulletin, 137(2), 297–315.
* Hugot, V. (2007). Eye gaze analysis in human-human interactions. Master of science thesis. Stockholm, Sweden.
* Kafkafi, N., Mayo, C. L., Drai, D., Golani, D., & Elmer, G. I. (2001). Natural segmentation of the locomotor behavior of drug-induced rats in a photobeam cage. Journal of Neuroscience Methods, 109, 111–121.
* Kendon, A. (1980). Gesticulation and speech: Two aspects of the process of utterance. In M. R. Key (Ed.), The relation between verbal and nonverbal communication. The Hague: Mouton.
* Kendon, A. (1983). Gesture and speech. How they interact. In J. M. Wiemann & R. P. Harrison (Eds.), Nonverbal Interaction. Beverly Hills: Sage.
* Kendon, A. (1986). Some reasons for studying gesture. Semiotica, 62, 3–28.
* Kendon, A. (2004). Gesture. Visible action as utterance. Cambridge: Cambridge University Press.
* Kibrik, A. A. (2008). Est’ li predloženie v ustnoj reči [Is there a sentence in spoken speech]. In A. V. Arxipov, L. V. Zaxarov, A. A. Kibrik et al. (Eds.), Fonetika i nefonetika. K 70-letiju Sandro V. Kodzasova [Phonetics and non-phonetics. Festschrift for the 70th birthday of Sandro V. Kodzasov]. Moscow: Jazyki slavjanskix kul’tur.
* Kibrik, A. A. (2010). Mul’timodal’naja lingvistika [Multimodal linguistics]. In Yu. I. Aleksandrov, V. D. Solov’jev (Eds.), Kognitivnyje issledovanija [Cognitive studies], IV. Moscow: Institute of psychology.
* Kibrik, A. A. (2011). Cognitive discourse analysis: Local discourse structure. In M. Grygiel and L. A. Janda (Eds.), Slavic linguistics in a cognitive framework. N.Y.: Peter Lang.
* Kibrik, A. A. (2015). The problem of non-discreteness and spoken discourse structure. Computational Linguistics and Intelligent Technologies, 14, vol. 1, 225–233.
* Kibrik, A. A., & Podlesskaja, V. I. (Eds.) (2009). Rasskazy o snovidenijax: Korpusnoe issledovanie ustnogo russkogo diskursa [Night Dream Stories: A corpus study of spoken Russian discourse]. Moscow: Jazyki slavjanskix kul’tur.
* Kibrik, A. A., & Molchanova, N. B. (2013). Channels of multimodal communication: Relative contributions to discourse understanding. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Conference of the Cognitive Science Society (pp. 2704–2709). Austin, TX: Cognitive Science Society.
* Knight, D. (2011). Multimodality and active listenership: A corpus approach. London: Bloomsbury.
* Kodzasov, S. V. (2009). Issledovanija v oblasti russkoj prosodii [Studies in the field of Russian prosody]. Moscow: Jazyki slavjanskix kul’tur.
* Kress, G. (2002). The multimodal landscape of communication. Medien Journal, 4, 4–19.
* Kress, G. (2010). Multimodality: A social semiotic approach to communication. London: Routledge Falmer.
* Laury, R., & Ono, T. (2014). The limits of grammar: Clause combining in Finnish and Japanese conversation. Pragmatics, 24(3), 561–592.
* Loehr, D. (2012). Temporal, structural, and pragmatic synchrony between intonation and gesture. Laboratory Phonology, 3(1), 71–89.
* Mann, W. C., & Thompson, S. A. (1988). Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3), 243–281.
* Matsumoto, K. (2003). Intonation units in Japanese conversation. Amsterdam: John Benjamins.
* McNeill, D. (1992). Hand and mind. Chicago: University of Chicago Press.
* McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press.
* McNeill, D., Quek, F., McCullough, K.-E., Duncan, S., Furuyama, N., Bryll, R., Ma, X.-F., & Ansari, R. (2001). Catchments, prosody, and discourse. Gesture, 1, 9–33.
* Melinger, A., & Levelt, W. J. M. (2004). Gesture and the communicative intention of the speaker. Gesture, 4, 119–141.
* Müller, C., Fricke, E., Cienki, A., & McNeill, D. (Eds.) (2014). Body – Language – Communication. Berlin: Mouton de Gruyter.
* Nikolaeva, Ju. V. (2013). Illustrativnyje žesty v russkom diskurse [Gesticulation in Russian discourse]. Diss. cand. philol. science. Moscow, Russia.
* Payrató, L., & Tessendorf, S. (2014). Pragmatic gestures. In C. Müller, E. Fricke, A. Cienki, & D. McNeill (Eds.), Body – Language – Communication. Berlin: Mouton de Gruyter.
* Shcherba, L. V. (1955). Fonetika francuzskogo jazyka [French phonetics]. Moscow: Izdatel’stvo literatury na inostrannyx jazykax.
* So, W. C., Kita, S., & Goldin-Meadow, S. (2009). Using the hands to identify who does what to whom: Gesture and speech go hand-in-hand. Cognitive Science, 33, 115–125.
* van Dijk, T. (1981). Episodes as units of discourse analysis. In D. Tannen (Ed.), Analyzing discourse: Text and talk. Georgetown: Georgetown University Press.
* Wolf, F., & Gibson, E. (2005). Representing discourse coherence: A corpus-based study. Computational Linguistics, 31(2), 249–287.
* Wouk, F. (2008). The syntax of intonation units in Sasak. Studies in Language, 32, 137–162.

===Appendix. Examples===
Notation in examples: dots followed by decimal numbers mark absolute pauses and their length in seconds; əə(0.3) and ɯɯɯ(0.8) mark plain and nasal filled pauses; the symbol = indicates a truncated word; a comma indicates a non-sentence-final EDU, a period a sentence-final EDU; square brackets indicate the boundaries of individual gestures, curly brackets indicate catchments, and angle brackets indicate gesture inertia.

{| class="wikitable"
|+ Example 1
! Time (min:s) !! EDU # !! Transcript !! Gesture type !! Translation
|-
| 00:16 || 7 || [···(0.5) u nego stojalo tri korziny] s grušami, || depictive || ‘[he had three baskets] with pears,
|-
| 00:18 || 8 || i on {[podnimalsja] na lestnicu, || depictive || and he {[was climbing up] the ladder,
|-
| 00:20 || 9 || [··(0.3) sobiral eti gruši] v [əə(0.3) fartuk], || depictive, depictive || [was collecting these pears] into [the apron],
|-
| 00:22 || 10 || [··(0.2) spu][skalsja || depictive || [was climbing] [down
|-
| 00:23 || 11 || i vykladyval]} eti gruši v korzinu. || depictive || and was taking out]} these pears into the basket.’
|}

{| class="wikitable"
|+ Example 2
! Time (min:s) !! EDU # !! Transcript !! Gesture type !! Translation
|-
| 00:59 || 29 || <[···(0.8) i ɯɯɯ(0.8) ego velosiped] vre= vrezalsja v kamen'. || depictive || ‘<[and his bicycle] ran into a rock.
|-
| 01:02 || 30 || ··(0.4) [on] upal, || depictive || [he] fell down,
|-
| 01:04 || 31 || ···(0.7) [s nego sletela] šljapa.> || depictive || his hat fell off. (lit. [from him fell] the hat.>)’
|}

{| class="wikitable"
|+ Example 3
! Time (min:s) !! EDU # !! Transcript !! Gesture type !! Translation
|-
| 00:29 || 17 || tam {[derevo], || depictive || ‘there is {[a tree],
|-
| 00:30 || 18 || [····(1.2)] k derevu prižata [lestnica], || depictive, depictive || [ ] to the tree [a ladder] is pressed,
|-
| 00:32 || 19 || [i vnizu lestnicy stojat] [tri korzinki], || depictive, depictive || [and under the ladder there are] [three baskets],
|-
| 00:34 || 20 || [dve iz kotoryx polnyje gruš], || depictive || [two of which are full of pears],
|-
| 00:36 || 21 || [a vtoraja pustaja].} || depictive || [and the second one is empty].}’
|}

{| class="wikitable"
|+ Example 4
! Time (min:s) !! EDU # !! Transcript !! Gesture type !! Translation
|-
| 02:24 || 71 || ···(0.6) əəə(0.6) [əəə(0.7) əəə(0.6)] [əəə(0.8) i vdrug] pered nim ··(0.2) okazyvajutsja ··(0.1) neskol’ko ··(0.1) parnej, || meta, meta || ‘[ ] [and suddenly] in front of him show up a few guys,
|-
| 02:28 || 72 || ···(0.6) troe, || || three of them,
|-
| 02:29 || 73 || ···(0.5) niotkuda, || || from nowhere,
|-
| 02:30 || 74 || neponjatno otkuda vzjavšixsja, || || not clear where they are coming from,
|-
| 02:31 || 75 || i oni {[··(0.2) načinajut sobirat’ eti gruši], || depictive || and they {[begin picking up these pears],
|-
| 02:33 || 76 || i [pomogat’ emu skladyvat’]} v korzinu. || depictive || and [helping him put them]} into the basket.’
|}
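
The temporal coordination measures reported in section 5 (the share of gestures falling within a single EDU, the share of EDUs with more than one gesture, and the share of gestures cooccurring with a whole EDU) can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes that EDU and gesture annotations have already been exported as (start, end) pairs in seconds; the names `Interval`, `falls_within`, and the three `share_*` functions are hypothetical, the `tolerance` criterion for "whole EDU" is an assumption of this sketch, and the toy values are invented rather than taken from the corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A time-aligned annotation; start and end are in seconds."""
    start: float
    end: float


def falls_within(gesture: Interval, edu: Interval) -> bool:
    """True if the gesture lies entirely inside the EDU's temporal bounds."""
    return edu.start <= gesture.start and gesture.end <= edu.end


def share_within_single_edu(gestures, edus) -> float:
    """Fraction of gestures enclosed by some single EDU."""
    if not gestures:
        return 0.0
    enclosed = sum(any(falls_within(g, e) for e in edus) for g in gestures)
    return enclosed / len(gestures)


def share_of_edus_with_multiple_gestures(gestures, edus) -> float:
    """Fraction of EDUs that enclose more than one gesture."""
    if not edus:
        return 0.0
    multi = sum(sum(falls_within(g, e) for g in gestures) > 1 for e in edus)
    return multi / len(edus)


def share_covering_whole_edu(gestures, edus, tolerance: float = 0.2) -> float:
    """Fraction of gestures whose onset and offset both lie within
    `tolerance` seconds of the bounds of some EDU.

    The tolerance criterion is an assumption of this sketch, not a
    definition taken from the paper.
    """
    if not gestures:
        return 0.0

    def covers(g: Interval, e: Interval) -> bool:
        return (abs(g.start - e.start) <= tolerance
                and abs(g.end - e.end) <= tolerance)

    whole = sum(any(covers(g, e) for e in edus) for g in gestures)
    return whole / len(gestures)


if __name__ == "__main__":
    # Invented toy annotations (seconds); not corpus data.
    edus = [Interval(0.0, 2.0), Interval(2.0, 4.0), Interval(4.0, 6.0)]
    gestures = [
        Interval(0.1, 1.9),  # spans nearly the whole first EDU
        Interval(2.5, 3.0),  # inside the second EDU
        Interval(3.1, 3.9),  # also inside the second EDU
        Interval(3.8, 4.5),  # straddles an EDU boundary
    ]
    print(share_within_single_edu(gestures, edus))
    print(share_of_edus_with_multiple_gestures(gestures, edus))
    print(share_covering_whole_edu(gestures, edus))
```

Strict containment is used for the "within a single EDU" measure, so a gesture that overshoots an EDU boundary by even a few milliseconds counts as not contained. With real annotations a small tolerance might be preferable there as well, and the choice would need to be reported alongside the percentages.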