=Paper=
{{Paper
|id=Vol-1347/paper05
|storemode=property
|title=Phonotactic probabilities in Italian simplex and complex words: a fragment priming study
|pdfUrl=https://ceur-ws.org/Vol-1347/paper05.pdf
|volume=Vol-1347
|dblpUrl=https://dblp.org/rec/conf/networds/BraccoCC15
}}
==Phonotactic probabilities in Italian simplex and complex words: a fragment priming study==
Giulia Bracco, Università di Salerno, Via Giovanni Paolo II 132, Fisciano (SA), gcbracco@unisa.it

Basilio Calderone, CNRS & Université de Toulouse II, 5 allées Antonio Machado, Toulouse, basilio.calderone@univ.tlse2.fr

Chiara Celata, Scuola Normale Superiore, P.zza dei Cavalieri 7, Pisa, celata@sns.it

Copyright © by the paper's authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

===1 Introduction===
Phonotactics refers to the sequential organization of phonological units that are legal in a language (Crystal 1992). However, legal sound sequences do not all occur with the same probability in a language. Phonotactic probability is most often measured in terms of transitional probabilities (TPs) of biphones and has been shown to influence a large range of processes, including infants' discrimination of native language sounds, adults' ratings of the wordlikeness of nonwords (Vitevitch et al. 1997), speech segmentation (Pitt & McQueen 1998, Mattys & Jusczyk 2001), word acquisition (Storkel 2001) and recognition (Luce & Large 2001). Specifically, in the domain of word recognition, high TPs facilitate word and nonword identification in speeded same-different matching tasks, but slow down identification in lexical decision tasks due to the inhibitory effects of a large neighborhood (e.g. Vitevitch & Luce 1999, Luce & Large 2001). Most of the studies on the role of TPs in speech production and perception have been conducted on English.

In this paper we focus on the role of phonotactic probabilities in priming morphologically simplex and complex words in Italian. We investigate whether biphone TPs affect the recognition of word targets after exposure to fragment primes differing in the probability with which the fragment-final consonant predicts the consecutive segment in the target.

We opted for a non-factorial, regression design including lexical and sub-lexical frequency and distributional variables as predictors (see Baayen 2010). In this paper, we report on the results of the study on simplex words only; we do, however, discuss the implications of the current findings for the processing of complex words.

===2 Experiment===
====2.1 Materials and procedure====
Forty-two native Italian speakers participated in a speeded lexical decision task in a fragment priming paradigm. Thirty bi- or tri-syllabic Italian nouns containing a biphonemic consonant cluster in internal position (e.g. borsa, 'bag') served as targets. Each target was primed by a sequence corresponding to an initial fragment of the target (e.g. bor-borsa). The fragment prime could consist of 3 or 4 phonemes and always ended with the first consonant of the cluster. The average length ratio between prime and target was 0.49.

The clusters were different across words and each cluster could occur in only one target (although more than one fragment could end in a given consonant). Twelve clusters were heterosyllabic (e.g. bor-sa 'bag'), 12 tautosyllabic (e.g. deg-rado 'decay') and 6 ambisyllabic (e.g. dis-tanza 'distance').

Another set of 30 Italian nouns, matched for average length, frequency and prime/target length ratio, was included, in which the fragment prime ended in a syllable-onset consonant followed by a vowel (e.g. tuc-tucano 'toucan'). The same proportion of fragment-final consonants was maintained in the two sets of words.

Sixty pseudowords matched for average length and properties of the fragment were added. Pseudowords were obtained by changing one letter of existing words (belonging to the same frequency range as the experimental words): for 1/3 in their initial part, for 1/3 in their central part and for 1/3 in their final part. The 30 clusters used for pseudowords did not appear in the word list.

In the lexical decision task, participants were asked to press a button corresponding to their dominant hand as soon as the orthographically presented target was judged to be a word, and a different button for targets judged to be nonwords. All the stimuli appeared in Courier New font, 18-point size, in the center of the computer screen. In order to avoid allographic effects, primes were displayed in uppercase and targets in lowercase. Fixation lasted 200 ms and was followed by a 50 ms pause. Primes appeared for 150 ms, followed by a 50 ms pause. The targets remained on the computer screen for a maximum of 1 s. If the participants did not produce any answer within that time, the feedback Fuori tempo ('Out of time') appeared on the screen. Reaction times (RTs) and the number of errors (Nerr) constituted the dependent variables. The reaction times were measured from target onset to the subject's response, and responses given after the deadline were scored as errors.

The experiment was preceded by a practice session. The experiment started once the participants had reached 70% valid responses.

====2.2 Experimental variables====
Several statistical and distributional properties of word primes, targets and clusters were derived from the CoLFIS corpus (Bertinetto et al., 2005).

For each prime-target pair, we calculated (i) the token frequency of the target ('TargetFreq'), (ii) the number of words beginning with the prime fragment ('PrimeTypeFreq'), (iii) the cumulated frequency of the words in (ii) ('PrimeTokenFreq'), (iv) the length of the target (in number of graphemes), (v) the length of the prime (in number of graphemes), (vi) the prime/target length ratio.

For each cluster, we calculated (vii) the TP value, i.e. the probability with which the first consonant of the cluster predicts the occurrence of the following consonant, calculated over the corpus word tokens ('BigramTP'), (viii) the number of words containing the cluster ('BigramTypeFreq'), (ix) the cumulated frequency of the words in (viii) ('BigramTokenFreq'), (x) the TP between the fragment prime and the second consonant of the cluster, e.g. P(s|bor) in borsa 'bag' ('SequenceTP'), (xi) the number of words containing the sequence of the prime followed by the second consonant of the cluster ('SequenceTypeFreq'), (xii) the cumulated frequency of the words in (xi) ('SequenceTokenFreq').

====2.3 Analysis and results====
Fixed-effects and mixed-effects models, the latter with subject and prime as random variables, were used.

For the purposes of the present study, we tested two different models, both including frequency variables and phonotactic probability variables; they are shown in Table 1. The two models differed in the presence, in model II, of a measure of prime frequency, which was not included in model I, and in focusing either on sequence and bigram token frequencies (model I) or on sequence and bigram type frequencies (model II). Both models were tested for CC items (e.g. bor-sa 'bag') and CV items (e.g. tuc-ano 'toucan') separately.

{| class="wikitable"
|+ Table 1. Fixed and random effects for the CC and CV items.
! !! Model I !! Model II
|-
! Fixed effects
| TargetFreq, LengthRatio, SequenceTokenFreq, BigramTokenFreq, SequenceTP, BigramTP
| TargetFreq, PrimeTokenFreq, LengthRatio, SequenceTypeFreq, BigramTypeFreq, SequenceTP, BigramTP
|-
! Random effects
| Subject, Fragment prime
| Subject, Fragment prime
|}

The results of the fixed effects analyses for the relevant models are summarized in Table 2 (dependent variable: RTs) and Table 3 (dependent variable: Nerr).

According to model I, with RTs as the dependent variable, the sequence's TP (i.e., the TP between the fragment prime and the second consonant of the cluster) turned out to be the most significant predictor, even outranking the contribution of the frequency values (for the target, the sequence and the bigram), all of which contributed to the intercept. A different picture emerged, however, for the CV items, for which no probability variable turned out to significantly predict the subjects' response times; instead, the target frequency, with a secondary contribution of the bigram frequency, appeared to play a role for this subset of items.

According to model II, for CC items the role of the target frequency turned out to be very important, and the only additional effect was generated by the sequence's TP. Thus the two models were similar in emphasizing the role of the probability with which a given consonant follows the prime sequence. As for CV items, model II returned a picture very similar to the one that emerged in model I, with target frequency and bigram type frequency as the only significant predictors.

Table 2. Fixed effects coefficients for the two models, CC and CV items (RTs = dependent variable).

When subject and prime were included as random factors, the pairwise comparison in the likelihood ratio test confirmed that adding the sequence's TP significantly improved the fit to the RT patterns: χ²(1) = 11.184, p = 0.0008 in model I; χ²(1) = 5.4403, p = 0.019 in model II.

The average reaction times and the number of errors were positively and significantly correlated, though with an intermediate correlation coefficient (r = .648, p < .01). We thus tested the two models with Nerr as the dependent variable, in order to determine whether the error rate was influenced by frequencies and probabilities to a different extent than response latencies.

With Nerr as the dependent variable, R² values were consistently lower than in the RT analyses (Table 3), indicating that the error patterns were accounted for by our frequency and probability variables to a more limited extent. In particular, both model I and model II emphasized, for the CC items, the role of target frequency as the only significant predictor of errors, while for CV items an additional role of bigram frequencies (by token and by type, respectively) was found. Thus for the CV items, RTs and error rate produced consistent results.

Table 3. Fixed effects coefficients for the two models, CC and CV items (Nerr = dependent variable).

===3 Discussion===
This work aimed to shed light on the role of TPs in a so far unstudied experimental environment, i.e., a lexical decision task with fragment priming. As most studies on phonotactic probabilities have focused on English, this work also adds evidence from a less investigated language, Italian.

Fragment priming is known to be modulated not only by word frequency and the frequencies of words matching the fragment but also by top-down information conveyed by the prime: a fragment prime matching a unique morpho-lexical family is as effective as a stem prime, thus showing that priming acts as a cue for the properties displayed in the target (see e.g. Laudanna & Bracco, 2006, for Italian).

This study has shown that the priming effect obtained when an initial fragment is available is also influenced by bottom-up variables; in particular, it depends on the probability with which the segments composing the fragment, or the fragment-final consonant, predict the occurrence of the consecutive consonant. Although to a lesser extent, the frequency with which bigrams and sequences occur (as types or tokens) in the lexicon also predicts the subjects' behavior. Phonotactic probabilities thus turned out to predict the subjects' responses to a large degree for many of the phonological environments tested in the current experiment, sometimes outperforming target frequencies, and consistently outweighing the contribution of the prime/target length ratio and of the prime frequency.

The results, however, suggested that phonotactic probabilities were overall more important in the case of consonant clusters than in the case of consonant-vowel sequences; thus it must be concluded that the contribution of TPs to lexical recognition is not the same across phonological environments. Consonant clusters might play a particularly relevant role in lexical access, compared to CV sequences, as contemporary theories based on the principles of phonological and morphological naturalness also seem to predict (see e.g. Dressler & Dziubalska-Kolaczyk, 2006; Korecky-Kroell et al. 2014).

Additionally, for CC sequences the token frequencies (of the bigram and of the prime + C sequence) turned out to be relatively more important than the corresponding type frequencies, suggesting that exposure to the number of occurrences of a cluster or of a segment sequence may be more important in lexical access than exposure to the individual items containing them.

An additional issue concerns the role of TPs in morphologically complex words. According to some models, morphological parsing is necessary for lexical access and the prefix (in the case of prefixed words) has to be stripped away in order for the word to be recognized (from Taft & Forster, 1975 onwards). Assuming a condition in which the fragment prime coincides with a prefix, TPs would play the additional role of marking the morphological boundary during the priming event. Given the results of the current study, it appears to be of utmost importance to verify further whether prefixed and pseudo-prefixed words behave in the same way. In fact, models postulating morphological pre-parsing (e.g. Schreuder & Baayen, 1995) would suggest that high TPs will codetermine latencies for prefixed targets only, while if morphology does not affect word recognition, then the TPs between the fragment prime and the following segment of the target will modulate latencies in prefixed and pseudo-prefixed words to the same extent.

A follow-up experiment will therefore test the contribution of phonotactic statistical knowledge to native speakers' access to complex word forms (specifically, prefixed nouns). Prefixed and pseudo-prefixed words will be used for that purpose. In particular, fragment primes will be selected according to two different conditions: in condition a) the targets are prefixed words and the fragment prime coincides with the prefix (e.g. bis-bisnonna 'great-grandmother'); in condition b) the targets are pseudo-prefixed words and no morphological boundary occurs between the initial fragment and the second part of the word (e.g. per-perdente 'loser'). Together with the current experiment, the experiment on prefixed and pseudo-prefixed words will determine whether or not the role of TPs is different when the target is a simplex word compared to when it is a prefixed word, and to when it is a pseudo-prefixed word. Different hypotheses may be put forward here, according to whether or not morphological boundaries affect the processing of consonant clusters (e.g., Calderone et al. 2014, Celata et al. 2015 in press), and according to the likelihood that a given sequence occurs as a morpheme or as a homographic non-morphological pattern (see Laudanna et al., 1994).

By describing phonotactic probability and frequency effects during word recognition, this study offers arguments for models of lexical access based on bottom-up processes, such as cohort models for orthographic stimuli (see e.g. Johnson & Pugh, 1994). The capacity of single consonants to predict the following segment, thereby speeding up the recognition of the whole word, as an additional if not independent way to access words and their subparts, might also be discussed with reference to models that associate orthographic input units with semantic and lexical knowledge (from connectionist models such as in Harm & Seidenberg, 1999, to amorphous models such as in Baayen et al. 2011).

===References===
R. Harald Baayen. 2010. A real experiment is a factorial experiment? The Mental Lexicon, 5(1): 149-157.

R. Harald Baayen, Petar Milin, Dusica Filipovic Durdevic, Peter Hendrix and Marco Marelli. 2011. An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118: 438-482.

Pier Marco Bertinetto, Cristina Burani, Alessandro Laudanna, Lucia Marconi, Daniela Ratti, C. Rolando and Anna Maria Thornton. 2005. Corpus e Lessico di Frequenza dell'Italiano Scritto (CoLFIS). http://linguistica.sns.it/CoLFIS/Home.net

Basilio Calderone, Chiara Celata, Katharina Korecky-Kroell and Wolfgang U. Dressler. 2014. A computational approach to (mor)phonotactics: Evidence from German. Language Sciences, 46 (part A): 59-70.

Chiara Celata, Katharina Korecky-Kroell, Irene Ricci and Wolfgang U. Dressler. 2015 (in press). Online processing of German (mor)phonotactic clusters by adults and adolescents. Italian Journal of Linguistics, 27(1).

Wolfgang U. Dressler and Katarzyna Dziubalska-Kolaczyk. 2006. Proposing Morphonotactics. Italian Journal of Linguistics, 18: 249-266.

N.F. Johnson and K.R. Pugh. 1994. A cohort model of visual word recognition. Cognitive Psychology, 26: 240-346.

Katharina Korecky-Kroell, Wolfgang U. Dressler, Eva Maria Freiberger, Eva Reinisch, Karlheinz Moerth and Gary Libben. 2014. Phonotactic and morphonotactic processing in German-speaking adults. Language Sciences, 46 (part A): 48-58.

Alessandro Laudanna, Cristina Burani and Antonella Cermele. 1994. Prefixes as processing units. Language and Cognitive Processes, 9: 295-316.

Alessandro Laudanna and Giulia Bracco. 2006. Stem and fragment priming on verbal forms of Italian. In Proceedings of the 5th International Conference on the Mental Lexicon (Montreal, Canada, 11-13 October, 2006): 26.

Paul A. Luce and Nathan R. Large. 2001. Phonotactics, density, and entropy in spoken word recognition. Language and Cognitive Processes, 16: 565-581.

Sven L. Mattys and Peter W. Jusczyk. 2001. Phonotactic cues for segmentation of fluent speech by infants. Cognition, 78: 91-121.

Mark Pitt and James McQueen. 1998. Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39: 347-370.

Robert Schreuder and R. Harald Baayen. 1997. How simplex complex words can be. Journal of Memory and Language, 37: 118-139.

Holly L. Storkel. 2001. Learning nonwords: Phonotactic probabilities in language development. Journal of Speech, Language, and Hearing Research, 44: 1321-1337.

Marcus Taft and Kenneth I. Forster. 1975. Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior, 14: 638-647.

Michael Vitevitch, Paul Luce, J. Charles-Luce and D. Kemmerer. 1997. Phonotactics and syllable stress: Implications for the processing of spoken nonsense words. Language and Speech, 40: 47-62.

Michael S. Vitevitch and Paul A. Luce. 1999. Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40: 374-408.

Michael Vitevitch, Paul A. Luce, David B. Pisoni and Edward T. Auer. 1999. Phonotactics, neighborhood activation and lexical access for spoken words. Brain and Language, 68: 306-311.
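As an illustration of the two transitional probabilities defined in Section 2.2 ('BigramTP' and 'SequenceTP'), the following Python sketch computes both over a toy lexicon. It is only a sketch under stated assumptions: the lexicon and its token counts are invented, graphemes stand in for phonemes, the prime is matched word-initially only, and the function names are hypothetical. The paper does not spell out how the counts were extracted from CoLFIS.

```python
from collections import Counter

# Hypothetical toy lexicon: orthographic form -> token frequency (invented counts,
# not CoLFIS data). Graphemes are used as a stand-in for phonemes.
LEXICON = {"borsa": 40, "bordo": 30, "borgo": 20, "bora": 10, "corsa": 50, "caro": 50}


def bigram_tp(first, second, lexicon):
    """Token-weighted P(second | first) over all adjacent segment pairs."""
    pair_counts = Counter()
    for word, freq in lexicon.items():
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += freq
    # Total token-weighted occurrences of `first` followed by any segment.
    context_total = sum(n for (a, _), n in pair_counts.items() if a == first)
    return pair_counts[(first, second)] / context_total if context_total else 0.0


def sequence_tp(prime, next_segment, lexicon):
    """Token-weighted P(next_segment | prime), with the prime taken word-initially."""
    total = 0
    hits = 0
    for word, freq in lexicon.items():
        if word.startswith(prime) and len(word) > len(prime):
            total += freq
            if word[len(prime)] == next_segment:
                hits += freq
    return hits / total if total else 0.0


if __name__ == "__main__":
    print("BigramTP   P(s|r)   =", bigram_tp("r", "s", LEXICON))
    print("SequenceTP P(s|bor) =", sequence_tp("bor", "s", LEXICON))
```

In this toy lexicon the two measures diverge because 'BigramTP' pools every occurrence of the first consonant, whereas 'SequenceTP' conditions on the whole prime; the regression models in Table 1 enter both as separate predictors.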
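The p-values reported for the likelihood ratio tests in Section 2.3 can be cross-checked from the χ² statistics alone. The sketch below assumes only that each test has one degree of freedom, as the notation χ²(1) indicates; in that case the upper-tail probability has the closed form erfc(sqrt(x/2)), so no statistics library is needed. The function name is hypothetical.

```python
import math


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    # With df = 1, X = Z**2 for a standard normal Z, hence
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Statistics as reported in Section 2.3.
    reported = {"model I": 11.184, "model II": 5.4403}
    for name, stat in reported.items():
        print(f"{name}: chi2(1) = {stat}, p = {chi2_sf_df1(stat):.4f}")
```

Both values should come out close to the reported p = 0.0008 and p = 0.019.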