Entrenchment Determinants Are Relevant throughout Spoken Language Production: A Study of Speech Errors

Svetlana Gorokhova (s.gorokhova@spbu.ru)
Department of Philology, Saint Petersburg State University,
11 Universitetskaya nab. Saint Petersburg, 199034 Russia

Abstract

Spontaneously produced Russian speech errors were analyzed for factors that may be expected to determine cognitive entrenchment. The results suggest that entrenchment determinants are relevant throughout the process of spoken language production, including phonological encoding, lemma retrieval, selection of inflected word forms, and grammatical agreement computation, and that the degree of entrenchment of linguistic units is predictive of speech errors.

Keywords: entrenchment; word frequency; age of acquisition; word length; associative relatedness; co-occurrence strength; phonological substitutions; semantic substitutions; inflected word forms; grammatical agreement

Introduction

Spreading-activation theories of production posit that whole networks of units are activated during language production, and the selected item is the one that receives the highest proportion of activation (Dell, 1986; McNamara, 1992; Stemberger, 1985). The degree to which a unit is entrenched in long-term memory may be assumed to affect the amount of activation that it receives, and consequently strongly entrenched units have a better chance of being selected (Schmid, 2007).

Usage-based cognitive-linguistic theories claim that the frequency of use of a linguistic unit correlates with its degree of cognitive entrenchment (Evans & Green, 2006; Hudson, 2007; Langacker, 1987; Rosch et al., 1976; Schmid, 2007) and that frequency is thus a major entrenchment predictor. Importantly, entrenchment of linguistic units is arguably determined not only by the frequency of their activation by individual speakers but also by the frequency of their occurrence in a speech community as a whole.

It has been suggested that, apart from frequency, entrenchment is likely to be determined by the age of acquisition (AoA) of a linguistic unit (Gerhand and Barry, 1999; Ghyselinck, Lewis, & Brysbaert, 2004; Morrison and Ellis, 1995, 2000) and the length of a linguistic string (Blumenthal-Dramé, 2012).

Although frequency and AoA are important explanatory mechanisms used by cognitive linguistic and psycholinguistic theories, the available experimental data on the role and the locus of the frequency and AoA effects in lexical retrieval are controversial. While many authors argue that the frequency effect is located at the stage of accessing phonological forms (e.g., Jescheniak and Levelt, 1994; Jurafsky, 2003) rather than at the semantic lemma level, there is some evidence to suggest that lexical selection (lemma retrieval) is also affected by word frequency (Gorokhova, 2013; Kittredge et al., 2008; Navarrete et al., 2006). Furthermore, recent studies provide evidence for the availability of probabilistic information about individual inflectional variants of a word in lexical memory (see Baayen et al., 2003; Fleischhauer & Clahsen, 2012; Gorokhova, 2011; Smolka, Zwitserlood, & Rösler, 2007). Similarly, a number of experimental studies propose that AoA is only relevant at the stage of phonological retrieval (Barry et al., 2001; Kittredge et al., 2008), although other authors argue that both frequency and AoA are fundamental for lexical retrieval (Catling et al., 2010; Meschyan & Hernandez, 2002), including its earlier stages (Brysbaert, Van Wijnendaele, & De Deyne, 2000).

Data

To explore the effect of entrenchment predictors on spoken language production, spontaneously produced Russian speech errors (slips of the tongue) of phonological, semantic, and syntactic types were analyzed for factors that may be regarded as determinants of entrenchment: frequency, age of acquisition (AoA), target-error co-occurrence strength, word association norms, and word length. The analyses used data from the Russian National Corpus, the Russian Word Association Thesaurus, and experimentally obtained AoA ratings for target and error words. The study involved 657 context-free sound-based noun substitution errors, 1378 context-free meaning-based noun substitution errors, 242 context-free errors that resulted in the selection of a wrong inflectional variant of the target word, and 274 agreement errors in modifier-head constructions. The errors were collected by tape recording and digitally recording everyday conversations, telephone conversations, and live TV and radio programs. If speakers did not catch and correct an error themselves, they were asked, where possible, what they had intended to say. If this was not possible (e.g. in the case of an error produced by a participant of a talk show), the error was only included in the corpus after being attested by two professional linguists.

Results and Discussion

Sound-based substitution errors

Phonological substitution errors were analyzed for word frequency and word length. Not surprisingly, the length comparison did not reveal any difference between target and error word length, as target and error words in phonological substitution errors are known to frequently have similar beginning segments and equal lengths.

Examples
1. Da, kstati, mne pyatno [pitnO] otčistili → ... pis′mo [pis′mO] ...
   By the way, I've had the spot cleaned → ... letter ...
2. Nas vstretili s bol′šim vnimanijem [vnimAnijəm] → ... vlijanijem [vlijAnijəm]
   They met us with great care → ... influence

At the same time, the results of the frequency analysis show that lower-frequency nouns tend to be replaced by phonologically related higher-frequency nouns (t(656)=3.41, p<0.001) (fig. 1).

Figure 1: Target vs. error log-transformed frequencies (Mean and SEM): Phonological substitution errors.

Meaning-based substitution errors

Semantic substitution errors were analyzed for word frequency, word length, age of acquisition (AoA), target-error co-occurrence strength, and word association norms.

Based on the types of conceptual-semantic relationships between the target and its substitute, the target-error pairs of nouns were classified as either “cohyponyms” (e.g. saucer → plate), “antonyms” (e.g. descendants → ancestors), or “associatively related” (e.g. carpets → floors) by 20 undergraduate students of linguistics from St Petersburg State University and by 4 professional linguists. The resulting error corpus under study comprised 724 cohyponym, 187 antonym, and 467 “associatively related” target-error pairs.

Examples
1. Cohyponyms
   Ja tebe bljudce, meždu pročim, xoču dostat′ → ... tarelku
   Incidentally, I want to get you a saucer → ... a plate
2. Antonyms
   A potom, predstavljaete, naši potomki obnaružat etu knigu → ... predki ...
   And then, can you imagine, our descendants will discover this book → ... ancestors ...
3. “Associatively related”
   Koška vse vremja deret kovry → ... poly
   The cat keeps tearing the carpets → ... floors

Word frequency

The results indicate that error word frequencies tend to exceed target word frequencies in cohyponym (t(723)=2.49, p<0.01) and “associatively related” (t(466)=3.91, p<0.0001) semantic substitutions but not in antonym substitutions (figs. 2, 3, and 4).

Figure 2: Target vs. error log-transformed frequencies (Mean and SEM): Cohyponym substitution errors.

Figure 3: Target vs. error log-transformed frequencies (Mean and SEM): “Associatively related” substitution errors.

Figure 4: Target vs. error log-transformed frequencies (Mean and SEM): Antonym substitution errors.

In addition, there is a highly significant positive correlation between target and error frequency values in the cohyponym (r=0.74, p<0.0001) and “associatively related” (r=0.60, p<0.0001) error types (figs. 5 and 6).

Figure 5: Correlation of target and error word frequencies: Cohyponym substitution errors.

Figure 6: Correlation of target and error word frequencies: “Associatively related” substitution errors.

Age of acquisition

AoA ratings for target and error words were obtained using the experimental procedure described in Kuperman, Stadthagen-Gonzalez, & Brysbaert (2012). The target and error words were distributed over lists of 250 words each. For every word, participants were asked to enter the age (in years) at which they thought they had learned the word, i.e. the age at which they would have understood the word even if they did not use it actively at the time. The lists were initially presented to 20 participants each. If a word received fewer than 18 valid observations after this phase because some values were missing in the completed lists, it was included in a new, comparable list at the end of the data collection and presented to new participants until the required number of observations was reached. The ratings were collected from 256 respondents, all of whom were native speakers of Russian with college education, aged between 21 and 76.

The results suggest that error words tend to be acquired earlier than target words in cohyponym (t(717)=2.94, p<0.01) and “associatively related” (t(446)=2.79, p<0.01) but not in antonym substitutions (figs. 7, 8, and 9).

Figure 7: Target vs. error age of acquisition (Mean and SEM): Cohyponym substitution errors.

Figure 8: Target vs. error age of acquisition (Mean and SEM): “Associatively related” substitution errors.

Figure 9: Target vs. error age of acquisition (Mean and SEM): Antonym substitution errors.

Word length

Word length measured in syllables, while not affecting cohyponym and antonym errors, may still be predictive of the outcome of “associatively related” semantic substitutions, in which target words were found to be significantly longer than error words (t(466)=2.49, p<0.01) (fig. 10).

Figure 10: Target vs. error log-transformed word length (syllables) (Mean and SEM): “Associatively related” substitution errors.

Target-error associative relatedness

Target-error pairs were analyzed in terms of word association norms from the Russian Word Association Thesaurus. For the purpose of this study, a target word was taken to be a stimulus word, and the substitute word, to be its associative response.

The results indicate that both antonym and cohyponym errors tend to have much higher measures of target-error associative relatedness than “associatively related” errors, while antonym target-error pairs are more closely related than cohyponyms (F(2, 1336)=17.71, p<0.0001) (fig. 11).

Figure 11: Target-error associative relatedness (Mean and SEM): “Associatively related” vs. cohyponym vs. antonym substitution errors.

Target-error co-occurrence strength

The Russian National Corpus was used to estimate the mutual informativeness, or co-occurrence strength, of the target and its substitute. Since the Mutual Information (MI) score is known to overestimate low-frequency words, the T-score was used in addition to MI because it highlights the word pairs whose co-occurrence frequency is high enough to be reliable. MI and T-scores were computed for each target-error pair with a context window of ±10 (the average length of a Russian language sentence).

Both antonym and cohyponym errors appear to have much higher measures of target-error co-occurrence strength than “associatively related” errors (F(2, 1372)=9.85, p<0.0001) (fig. 12).

Figure 12: Target-error co-occurrence strength (Mutual Information score) (Mean and SEM): “Associatively related” vs. cohyponym vs. antonym substitution errors.

The findings are in line with the view that lexical retrieval is affected by word frequency (Brysbaert et al., 2000; Navarrete et al., 2006) and AoA (Catling et al., 2010; Meschyan & Hernandez, 2002) and provide support for the hypothesis that various determinants of entrenchment may play a role throughout the process of lexical selection, including its earlier stages such as the stage of lemma retrieval.

Substitutions of inflected word forms

The analyses, based on the frequency data from the spoken part of the Russian National Corpus, involved context-free substitutions of inflected word forms that resulted in the selection of a wrong inflectional variant of a noun, pronoun, verb, or adjective.

Examples
1. Case (DAT → GEN)
   Ty otvezi ix v Moskvu k RODSTVENNIK-AM
   you take them to Moscow to relative-PL.DAT
   → ... k RODSTVENNIK-OV
   to relative-PL.GEN
   Why don't you take them to your relatives in Moscow.
2. Person (2nd → 3rd)
   Poslezavtra BUD-EŠ′ otdoxnuvšij
   day after tomorrow be-2SG.FUT well-rested
   → ... BUD-ET ...
   be-3SG.FUT
   You’ll feel well-rested the day after tomorrow.
3. Tense (FUT → PST)
   Ja vynuždena BUDU vyslušat′ plamennuju tiradu
   I have to be:3SG.FUT listen to fiery tirade
   → ... BYLA ...
   be:3SG.PST
   I’ll have to listen to a fiery tirade.

The results indicate, firstly, that token frequency is relevant to the selection of inflected word forms. A comparison between the frequencies of the target and error inflected word forms in the corpus reveals that the general tendency is for a higher-frequency inflectional variant of a word to substitute for a lower-frequency variant (t(241)=3.78, p<0.001) (fig. 13).

Figure 13: Target vs. error log-transformed frequencies (Mean and SEM): Substitutions of inflected word forms.

Thus, Russian speech error data corroborate the claim that the production of inflected forms may be influenced by word form frequency (Baayen et al., 2003; Clahsen, Hadler, & Weyerts, 2004; Fleischhauer & Clahsen, 2012).

Furthermore, it appears that the selection of inflected forms is sensitive to type frequency. A comparison between the relative frequencies of the target and error inflectional variants within the word’s declension paradigm suggests that the case forms of nouns and personal pronouns that occur most frequently in spoken Russian (nominative, genitive, and accusative) tend to substitute for the less frequent oblique case forms such as the dative, whereas the higher-frequency nominative and accusative forms tend to replace the genitive (t(86)=4.03, p<0.001) (fig. 14).

Figure 14: Target vs. error relative frequencies (per cent) (Mean and SEM): Substitutions of noun/personal pronoun case forms.

The results provide evidence in favor of usage-based models of mental grammar, suggesting that the degree of entrenchment, regarded as a mental correlate of usage frequency, may influence the selection of a word's inflectional variants. Furthermore, the data provide support for the claim that the frequency of use of grammatical constructions at different levels of schematicity is an important determinant of linguistic structure and language use (see Bybee, 2006; Croft & Cruse, 2004; Diessel, 2007).

Errors of agreement in modifier-head constructions

The analysis involved “reversed agreement” errors in modifier-head [Adj/Part/Pron/Num+N] constructions, in which a speaker selects an irrelevant noun case form based on the case-ambiguous pre-modifier adjective form instead of computing the adjective case form based on the head noun form.

Examples
1. PL.LOC → PL.GEN
   na et-IX forum-AX
   at this-PL.GEN/LOC forum-PL.LOC
   →
   na et-IX forum-OV
   at this-PL.GEN/LOC forum-PL.GEN
   (I visited different Internet forums and) at these forums…
2. SG.F.GEN → SG.F.DAT
   mil-OJ ženščin-Y
   nice-SG.F.GEN/DAT/INS/LOC woman-SG.F.GEN
   →
   mil-OJ ženščin-E
   nice-SG.F.GEN/DAT/INS/LOC woman-SG.F.DAT
   (I visited a presentation made by a) nice woman…

The errors were analyzed using frequency data from the disambiguated part of the Russian National Corpus. The comparison reveals a tendency for speakers to substitute more frequent constructions for less frequent constructions (t(236)=3.49, p<0.001) (fig. 15). To ascertain that the result was not due to higher frequencies of error head nouns, a statistical test was run to compare target and error head noun frequencies. It did not produce a significant frequency difference, so it seems plausible to conclude that it is the higher frequencies of error modifier-head constructions that account for the result.

Figure 15: Target vs. error construction log-transformed frequencies (Mean and SEM): Agreement errors in modifier-head [Adj/Part/Pron/Num+N] constructions.

Evidence from agreement errors in modifier-head constructions indicates that the production mechanism makes use of distributional patterns of relevant constructions stored in long-term memory. This finding is in line with some recent studies that suggest that number agreement may be computed based on the speaker’s linguistic experience (Haskell, Thornton, & MacDonald, 2010; Thornton & MacDonald, 2003). The error construction seems to be a well-entrenched recurrent pattern, which a speaker, based on their linguistic experience, tends to use as a default schema instead of using more generalized constructional schemas.

Conclusion

Speech error data provide supportive evidence for the claim that properties of linguistic units regarded as entrenchment determinants are important throughout the process of spoken language production. Furthermore, the results suggest that the degree of entrenchment of words and lexico-grammatical structures may be a factor involved in the occurrence of speech errors. At the same time, more research is needed to investigate the effects of other variables such as imageability on the word’s susceptibility to errors in normal speech.

Acknowledgments

This work was supported by RBRF grant No. 13-06-00353.

References

Baayen, H., McQueen, J., Dijkstra, T., & Schreuder, R. (2003). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In H. Baayen & R. Schreuder (Eds.), Morphological Structure in Language Processing. Berlin: Mouton de Gruyter.

Blumenthal-Dramé, A. (2012). Entrenchment in Usage-Based Theories: What corpus data do and do not reveal about the mind. Berlin: Walter de Gruyter.

Brysbaert, M., Van Wijnendaele, I., & De Deyne, S. (2000). Age-of-acquisition effects in semantic processing tasks. Acta Psychologica, 104, 215–226.

Bybee, J. L. (2006). Frequency of use and the organization of language. Oxford: Oxford University Press.

Catling, J. C., Dent, K., Johnston, R. A., & Balding, R. (2010). Age of acquisition, word frequency, and picture-word interference. Quarterly Journal of Experimental Psychology, 63(7), 1304–1317.

Clahsen, H., Hadler, M., & Weyerts, H. (2004). Speeded production of inflected words in children and adults. Journal of Child Language, 31, 683–712.

Croft, W., & Cruse, D. A. (2004). Cognitive Linguistics. Cambridge: Cambridge University Press.

Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3), 283–321.

Diessel, H. (2007). Frequency effects in language acquisition, language use, and diachronic change. New Ideas in Psychology, 25, 108–127.

Evans, V., & Green, M. (2006). Cognitive Linguistics: An Introduction. Edinburgh: Edinburgh University Press.

Fleischhauer, E., & Clahsen, H. (2012). Generating inflected word forms in real time: Evaluating the role of age, frequency, and working memory. In A. Biller, E. Chung, & A. Kimball (Eds.), Proceedings of the 36th annual Boston University Conference on Language Development. Vol. 1. Somerville, MA: Cascadilla Press.

Gerhand, S., & Barry, C. (1999). Age of acquisition, frequency and the role of phonology in the lexical decision task. Memory and Cognition, 27, 592–602.

Ghyselinck, M., Lewis, M. B., & Brysbaert, M. (2004). Age of acquisition and the cumulative-frequency hypothesis: A review of the literature and a new multi-task investigation. Acta Psychologica, 115, 43–67.

Gorokhova, S. (2011). The role of frequency effects in the selection of inflected word forms: A corpus study of Russian speech errors. In M. Konopka, J. Kubczak, C. Mair, F. Šticha, & U. H. Wassner (Eds.), Grammar and Corpora 2009: Corpus Linguistics and Interdisciplinary Perspectives on Language, Vol. 1. Tübingen: Gunter Narr Verlag.

Gorokhova, S. (2013). Some factors that determine the outcome of lexical competition in language production: A corpus-based analysis of Russian speech errors. In A. Stefanowitsch & J. Goschler (Eds.), Yearbook of the German Cognitive Linguistics Association, 2013. Vol. 1. Berlin/Boston: Mouton de Gruyter.

Haskell, T. R., Thornton, R., & MacDonald, M. C. (2010). Experience and grammatical agreement: Statistical learning shapes number agreement production. Cognition, 114, 151–164.

Hudson, R. (2007). Word grammar. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford Handbook of Cognitive Linguistics. New York: Oxford University Press.

Jescheniak, J. D., & Levelt, W. J. M. (1994). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(4), 824–843.

Jurafsky, D. (2003). Probabilistic modeling in psycholinguistics: Linguistic comprehension and production. In R. Bod, J. Hay, & S. Jannedy (Eds.), Probabilistic Linguistics. Cambridge, MA: MIT Press.

Kittredge, A. K., Dell, G. S., Verkuilen, J., & Schwartz, M. F. (2008). Where is the effect of frequency in word production? Insights from aphasic picture-naming errors. Cognitive Neuropsychology, 25, 463–492.

Kuperman, V., Stadthagen-Gonzalez, H., & Brysbaert, M. (2012). Age-of-acquisition ratings for 30,000 English words. Behavior Research Methods, 44, 978–990.

Langacker, R. W. (1987). Foundations of cognitive grammar. Vol. 1, Theoretical Prerequisites. Stanford, CA: Stanford University Press.

McNamara, T. P. (1992). Theories of priming: I. Associative distance and lag. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1173–1190.

Meschyan, G., & Hernandez, A. (2002). Age of acquisition and word frequency: Determinants of object-naming speed and accuracy. Memory & Cognition, 30(2), 262–269.

Morrison, C. M., & Ellis, A. W. (1995). The roles of word frequency and age of acquisition in word naming and lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 116–133.

Morrison, C. M., & Ellis, A. W. (2000). Real age of acquisition effects in word naming and lexical decision. British Journal of Psychology, 91, 167–180.

Navarrete, E., Basagni, B., Alario, F.-X., & Costa, A. (2006). Does word frequency affect lexical selection in speech production? The Quarterly Journal of Experimental Psychology, 59, 1681–1690.

Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.

Schmid, H.-J. (2007). Entrenchment, salience, and basic levels. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford Handbook of Cognitive Linguistics. New York: Oxford University Press.

Smolka, E., Zwitserlood, P., & Rösler, F. (2007). Stem access in regular and irregular inflection: Evidence from German participles. Journal of Memory and Language, 57, 325–347.

Stemberger, J. P. (1985). An interactive activation model of language production. In A. Ellis (Ed.), Progress in the psychology of language. Vol. 1. London: Erlbaum.

Thornton, R., & MacDonald, M. C. (2003). Plausibility and grammatical agreement. Journal of Memory and Language, 48, 740–759.
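
Illustrative computation. The two association measures named in the analysis of target-error co-occurrence strength (Mutual Information and T-score) and the target vs. error comparisons reported as t statistics can be illustrated with a minimal Python sketch. It is not the procedure used in the study. The expected co-occurrence count f(x)·f(y)·span/N is the common textbook collocation formula and is assumed here, a paired design is assumed from the reported degrees of freedom, and all function names, parameter names, and numbers below are hypothetical.

```python
import math
from statistics import mean, stdev


def association_scores(f_xy, f_x, f_y, corpus_size, span=1):
    """Return (MI, T-score) for one target-error word pair.

    f_xy        -- number of co-occurrences of the two words within the window
    f_x, f_y    -- corpus frequencies of the target and the error word
    corpus_size -- number of tokens in the corpus
    span        -- window positions counted (e.g. 20 for a +/-10 window)

    The expected count is an assumption of this sketch, not a formula
    taken from the paper.
    """
    if f_xy <= 0:
        raise ValueError("scores are undefined when the pair never co-occurs")
    expected = f_x * f_y * span / corpus_size
    mi = math.log2(f_xy / expected)                 # observed vs. expected, in bits
    t_score = (f_xy - expected) / math.sqrt(f_xy)   # penalizes rare pairs
    return mi, t_score


def paired_t(target_values, error_values):
    """Return (t, df) for the error-minus-target differences.

    Values are assumed to be already transformed (e.g. log frequencies).
    A positive t means error words score higher than their targets.
    """
    if len(target_values) != len(error_values):
        raise ValueError("each target needs exactly one error value")
    diffs = [e - t for t, e in zip(target_values, error_values)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    standard_error = stdev(diffs) / math.sqrt(n)
    if standard_error == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / standard_error, n - 1


# Invented counts, for illustration only.
print(association_scores(f_xy=4, f_x=10, f_y=10, corpus_size=100))  # (2.0, 1.5)
print(paired_t([1.0, 2.0, 3.0], [2.0, 3.0, 5.0]))
```

With the invented values in the last line, the error-minus-target differences are 1, 1, and 2, which gives t of approximately 4.0 with 2 degrees of freedom. The T-score divides by the square root of the observed count, so a pair seen only a handful of times receives a low score even when its MI is high, which is the reason the text gives for reporting both measures.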