
Effects of processing complexity in perception and production. The case of English comparative alternation

Gero Kunter
English Language and Linguistics
Heinrich-Heine-Universität Düsseldorf
gero.kunter@uni-duesseldorf.de

Copyright © by the paper's authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa.


This paper discusses the effect of processing complexity on the English comparative alternation. The reported experiments show a processing advantage of the synthetic comparative in perception, but a preference for the analytic comparative in sentence production if the base adjective is cognitively complex. These results imply that perceptual complexity and complexity in production have diverging effects on the English comparative alternation. More generally, the paper calls for a fine-grained look at the role of processing complexity in areas of morphosyntactic variation.

1 Introduction

Most English comparatives are formed using either a synthetic form (e.g. easier) or an analytic form (e.g. more important). While most adjectives clearly prefer either the synthetic or the analytic comparative, there is a considerable number of adjectives which frequently take both forms, e.g. more friendly vs. friendlier. The choice between the two forms is influenced by several phonological, morphological, syntactic and semantic factors. For example, the probability of analytic comparatives increases with the number of morphemes in the adjective base. It is also higher if the comparative is in predicative rather than attributive position, and it decreases with an increasing comparative/positive ratio (see Szmrecsanyi 2005, Hilpert 2008 and Mondorf 2009 for detailed discussions).

Mondorf (2009) argues that these factors are all part of a more general, audience-oriented compensatory mechanism called more-support: if the cognitive complexity of the adjectival base or its environment increases, speakers prefer the analytic comparatives, because they have a processing advantage over the corresponding synthetic form. For instance, an adjective that is morphologically complex is assumed to be also cognitively more complex than a simplex adjective, and in order to compensate for this increased cognitive complexity, speakers may prefer the analytic comparative over the synthetic alternative.

Yet, little psycholinguistic research has investigated this assumed processing advantage of analytic forms. A notable exception is Boyd (2007, ch. 2), who conducted a self-paced reading experiment to investigate processing differences between synthetic and analytic comparatives. Indeed, he reports shorter reaction times for the sentences containing analytic comparatives, but due to the experimental design, this evidence is only indirect and allows for alternative interpretations. As yet, then, there is only limited empirical evidence for the assumption that analytic comparatives are easier to process than synthetic comparatives. In addition, as pointed out by Mondorf (2014, 201), it is still an unresolved issue whether more-support is a response to increased processing loads in production or in perception.

This paper addresses these two issues. First, it presents the results from a perception experiment which tested whether analytic comparatives are indeed easier to process for listeners. Contrary to this hypothesis, the reaction times show that analytic comparatives have a processing disadvantage in perception. Then, a production experiment is discussed which elicited spoken sentences containing a comparative construction. The analysis reveals that the processing complexity is a significant predictor of the comparative alternation: with increasing complexity of the base adjective, the probability of analytic comparatives increases. Thus, the paper argues that speakers and listeners process the English comparative variants differently, and that it is the speaker who benefits from a compensatory use of more comparatives.

2 Comparative variation in perception

2.1 Method

31 native speakers of Canadian English participated in an auditory decision task in which they had to decide whether the acoustic stimulus was an existing English form. The set of stimuli contained the analytic and synthetic comparative form for 60 adjective types with at least 5 attestations for both forms in the Corpus of Contemporary American English (Davies 2008-). The stimuli were produced by a male speaker of Canadian English with phonetic training. He was instructed to produce the stimuli in citation form with a single accent on the primary stressed syllable of the base adjective in both types of stimuli. Accordingly, more was produced stressed, but unaccented.

Alongside the 2 × 60 = 120 synthetic and analytic comparatives, the set of stimuli also included 360 distractors. Some of the distractors combined more with non-existing words, others combined the adjective bases with the illegal suffix -ic. In addition, the set of distractor items contained non-existing words ending in -er as well as existing words and complex words. Examples of the test stimuli are given in (1a), and distractor examples are given in (1b).

(1) a. colder, happier, yellower
       more cold, more wealthy, more yellow
    b. ∗coldic, more ∗gorsty, ∗rilker
       on wire, chasting

2.2 Results

The density estimate suggests that reaction times are, in general, higher for analytic comparatives than for synthetic comparatives. This visual interpretation is supported by a linear mixed-effects regression model with reaction times as the dependent variable (in order to fulfill the linearity assumption of the linear model, the reaction times were power-transformed with λ = -1.52, see Box and Cox 1964). The main predictor was the factor Class (with values Synthetic and Analytic). Additional predictors addressed several influences that may be expected to affect the reaction times: the subject-specific variables Handedness, Sex, and Age; the experimental variables Trial number and Reaction time in previous trial (Preceding RT, see Baayen and Milin 2010 for a discussion); the phonological variables Metrical structure of base and residualized Number of phonemes; and the lexical variables Number of phonological neighbours, Mean RT of base adjective, residualized Phonological Levenshtein distance (PLD20, all three from Balota et al. 2007), Age of acquisition (from Kuperman et al. 2012), Frequencies of base, Analytic comparative, and Synthetic comparative (from COCA), and Inflectional entropy (cf. Moscoso del Prado Martín et al. 2004). With the exception of the three Subject predictors, the initial model contained interactions between Class and the other predictors. Finally, random intercepts were included for the factors Subject and Adjective base.
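The power transformation mentioned above can be illustrated with a minimal sketch. It assumes the standard Box-Cox form (x^λ - 1)/λ with the reported λ = -1.52; whether the analysis used exactly this form, and on which time scale, is not stated in the text, so the function name `box_cox`, the use of seconds, and the example values are all illustrative assumptions.

```python
import math


def box_cox(value, lam):
    """Standard Box-Cox power transform of a positive value.

    Assumption: this is the textbook form from Box and Cox (1964);
    the exact variant used in the analysis is not given in the text.
    """
    if value <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam


LAMBDA = -1.52  # value reported in the text

# Made-up reaction times in seconds, for illustration only.
example_rts_s = [0.6, 0.8, 1.0, 1.5]
transformed = [box_cox(rt, LAMBDA) for rt in example_rts_s]

for rt, value in zip(example_rts_s, transformed):
    print(f"{rt:.1f} s -> {value:.3f}")
```

Under this form the transform is monotonically increasing even though λ is negative, so larger transformed values still correspond to longer reaction times, while long reaction times are compressed more strongly than short ones.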

After removal of non-significant predictors, the final model reports significant interactions between stimulus Class and Preceding RT, PLD20, Number of phonemes, Synthetic frequency, and Analytic frequency. Figure 2 displays the partial effects for these interactions. The vertical axis shows the transformed reaction times; higher values correspond to longer reaction times.

In agreement with figure 1, the partial effects reveal significantly lower estimates for the synthetic stimuli (solid lines) than for the analytic stimuli (dashed lines). This is true even in the most adverse conditions (e.g. in cases in which the synthetic comparative of an adjective is attested only very rarely in a linguistic corpus, left edge of lower right panel in figure 2).

3 Comparative variation in production

3.1

41 native speakers of Canadian English participated individually in a spoken sentence completion task. The task used the same set of 60 adjectives as in the perception experiment above, but none of the participants in the production experiment had participated in the previous task. Participants were first shown a context sentence containing the adjective in the positive. After a key press, an incomplete target sentence containing a blank and one or more target words also appeared on the screen. The participants were instructed to use the target words to fill the blank in the sentence. If necessary, they could also use additional words to complete the sentence. The sentences were constructed in such a way that a comparative construction was the most likely target for completion, but participants were not explicitly instructed to use comparatives. The structure of the incomplete sentences was the same in all trials. The subject was a simple noun phrase, followed by a copula verb. The blank to be filled followed in predicative position. This design ensured that the context-dependent factors reported in the literature, such as the increased probability of analytic comparatives in predicative position, were held constant for all adjectives. Example (2) shows the experimental trial for the target adjective wealthy.

(2) The duke is wealthy.
    Yet, the king is WEALTHY.

The experiment also contained 105 distractor trials that had a similar structure, but which did not contain adjectives as the target words.

3.2 Reaction times

In order to investigate the effect of the processing complexity of the base adjective on the preferred comparative variant, the same 41 speakers first participated in a visual lexical decision task that gathered reaction times for the 60 target adjectives, as well as 150 other existing and non-existing distractor items. The participants were not informed about the purpose of this task, and there were at least 14 days for each participant between the lexical decision task and the production experiment. The reaction times obtained in this task were pooled for each adjective, and the median was calculated.

3.3 Results

For most of the adjectives, the completion task was successful in obtaining comparative responses from the 41 speakers. However, two participants produced hardly any comparatives in the task, and were therefore excluded from the data set. 6 out of the 60 adjectives were excluded because the responses contained almost exclusively synthetic or analytic comparatives, or because the context sentence did not elicit a considerable number of comparative responses. 747 out of the remaining 39 × 54 = 2106 responses contained a synthetic comparative (35 %), 843 contained an analytic comparative (40 %). The remaining 516 responses (25 %) did not contain a comparative construction, and were discarded. There was notable variation between the two variants both across and within items, which indicates that English comparative variation is indeed a highly non-deterministic field that is apparently affected by both speaker-dependent and adjective-dependent factors.
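The response counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Figures taken from the text; variable names are illustrative.
participants = 41 - 2   # two speakers excluded
adjectives = 60 - 6     # six adjectives excluded
total_responses = participants * adjectives

counts = {"synthetic": 747, "analytic": 843, "no comparative": 516}
counts_add_up = sum(counts.values()) == total_responses

# Share of each response type among all remaining responses, in percent.
shares = {label: round(100 * n / total_responses) for label, n in counts.items()}

# Responses that enter the analysis after discarding non-comparatives.
comparative_responses = counts["synthetic"] + counts["analytic"]

print(total_responses, counts_add_up, shares, comparative_responses)
```

The last value shows that 1590 comparative responses remain once the 516 responses without a comparative construction are discarded.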

Logistic generalized additive mixed-effects models (cf. Wood 2006) were used to investigate the relation between the median RTs and the individual responses. These models have the advantage of revealing statistically significant effects of the independent variable on the dependent variable even if the relation between them is not a linear one. For instance, there could be a threshold in the reaction times up to which speakers strongly prefer the synthetic comparative, but beyond which they shift to analytic comparatives in a nearly categorical way. In such a case, a linear model might fail to detect this nonlinear effect of RTs on the responses.
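The threshold scenario described above can be made concrete with a small sketch. It is not the additive model used in the analysis: it simply defines a made-up, threshold-shaped probability of an analytic response and fits an ordinary least-squares line to it on the probability scale. The threshold (600 ms), the steepness (10 ms), the RT grid, and all function names are hypothetical choices for illustration, not estimates from the experiment.

```python
import math


def threshold_probability(rt_ms, threshold_ms=600.0, steepness_ms=10.0):
    """Hypothetical probability of an analytic response with a sharp threshold."""
    return 1.0 / (1.0 + math.exp(-(rt_ms - threshold_ms) / steepness_ms))


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up grid of median RTs from 400 ms to 800 ms in 5 ms steps.
rts = [400.0 + 5.0 * i for i in range(81)]
probs = [threshold_probability(rt) for rt in rts]

slope, intercept = least_squares_line(rts, probs)
residuals = [p - (slope * rt + intercept) for rt, p in zip(rts, probs)]
max_abs_residual = max(abs(r) for r in residuals)

print(f"slope per ms: {slope:.5f}")
print(f"largest deviation from the line: {max_abs_residual:.2f}")
```

In this toy setting the straight line deviates strongly from the curve on either side of the threshold, which is the kind of pattern a smooth term can follow and a single linear slope cannot.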

Two models were fitted: a null model which contained only a random effect for speaker, and a model with an additional smooth term for the effect of the median RTs. If processing complexity has a notable effect on speaker responses, the smooth term should turn out to be statistically significant, and the predictive accuracy of the model should improve by the addition of the term. As table 1 shows, this is indeed the case. While the null model has a total predictive accuracy of about 69 %, the addition of the smooth term for median RTs increases the accuracy by 5.6 %. There is a larger increase of predictive accuracy for analytic responses than for synthetic responses (7.1 % vs. 3.9 %).

4 Discussion and conclusion

Acknowledgments

This work was supported by the Deutsche Forschungsgemeinschaft (grant KU 2896/1-1). I wish to thank Ben Tucker (University of Alberta, Edmonton) for making available to me the facilities of the Alberta Phonetics Laboratory for the experiments reported in this paper.

References

Harald Baayen and Petar Milin. 2010. Analyzing reaction times. International Journal of Psychological Research, 3(2):12-28.

David A. Balota, Melvin J. Yap, Michael J. Cortese, Keith A. Hutchison, Brett Kessler, Bjorn Loftis, James H. Neely, Douglas L. Nelson, Greg B. Simpson, and Rebecca Treiman. 2007. The English Lexicon Project. Behavior Research Methods, 39(3):445-459.

Jeremy Boyd. 2007. Comparatively speaking. A psycholinguistic study of optionality in grammar. Ph.D. thesis, University of California, San Diego.

George E. P. Box and David R. Cox. 1964. An analysis of transformations. Journal of the Royal Statistical Society. Series B, 26(2):211-252.

Mark Davies. 2008-. The Corpus of Contemporary American English (COCA): 450 million words, 1990-present. Available online at http://corpus.byu.edu/coca/.

Martin Hilpert. 2008. The English comparative. Language structure and language use. English Language and Linguistics, 12(3):395-417.

Victor Kuperman, Hans Stadthagen-Gonzalez, and Marc Brysbaert. 2012. Age-of-acquisition ratings for 30,000 English words. Behavior Research Methods, 44(4):978-990.

John H. McWhorter. 2001. The world's simplest grammars are creole grammars. Linguistic Typology, 5:125-166.

Britta Mondorf. 2009. More support for more-support. John Benjamins, Amsterdam.

Britta Mondorf. 2014. Apparently competing motivations in morpho-syntactic variation. In Brian MacWhinney, Andrej Malchukov, and Edith Moravcsik, editors, Competing motivations in grammar and usage, pages 209-228. Oxford University Press, Oxford.

Fermín Moscoso del Prado Martín, Aleksandar Kostić, and R. Harald Baayen. 2004. Putting the bits together. An information theoretical perspective on morphological processing. Cognition, 94(1):1-18.

Benedikt Szmrecsanyi. 2005. Language users as creatures of habit: A corpus-based analysis of persistence in spoken English. Corpus Linguistics and Linguistic Theory, 1(1):113-150.

Simon N. Wood. 2006. Generalized Additive Models. An introduction with R. Chapman & Hall/CRC, Boca Raton, FL.

Melvin J. Yap, Sarah E. Tan, Penny M. Pexman, and Ian Hargreaves. 2011. Is more always better? Effects of semantic richness on lexical decision, speeded pronunciation, and semantic classification. Psychonomic Bulletin & Review, 18(4):742-750.