<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jim Blevins</string-name>
          <email>jpb39@cam.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petar Milin</string-name>
          <email>petar.milin@uni-tuebingen.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Ramscar</string-name>
          <email>michael.ramscar@uni-tuebingen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eberhard Karls Universität Tübingen</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Cambridge</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Novi Sad, Eberhard Karls Universität Tübingen</institution>
        </aff>
      </contrib-group>
      <fpage>29</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>This talk outlines how form variation can be modelled in terms of equilibria between two dominant communicative pressures. The pressure to discriminate forms of a language enhances differences between expressions. Unchecked, this pressure can in principle lead to suppletion of the kind reported in languages such as Yélî Dnye (Henderson ). However, in most languages, the pressure towards maximally discriminative expressions is countered by the need to extrapolate from sparse input. It has long been known that corpora provide only a partial coverage of the forms of a language (inflectional and derivational). This talk presents evidence that the shortfall is far greater and far more systematic than previously appreciated, and that the coverage of the form variation remains sparse in corpora of up to one billion words. The sampling reported in this talk suggests that the forms in a corpus or encountered by a speaker exhibit a Zipfian distribution at all sample sizes. The interaction of these pressures also accounts for the role of lexical neighbourhoods. Since most paradigms will be only partially attested, the organization of paradigms into neighbourhoods provides an analogical base for extrapolation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>It is usually assumed that regularity in a linguistic
system is desirable or normative and that
suppletion and other irregularities represent deviations
from the uniform patterns that systems (or their
speakers) strive to maintain. From a
discriminative perspective, the situation is exactly reversed.
To the extent that patterns like suppletion enhance
the discriminability of forms, they contribute to the
communicative efficiency of a language. In a
discriminative model, such as that of Ramscar et al.
(), the only difference between overtly
suppletive forms such as mouse/mice and more regular
forms such as rat/rats is that the former serve to
accelerate the rate at which a speaker’s representation
of a specific form/meaning contrast becomes
discriminated from the form classes that express
similar contrasts. Thus all learning serves to increase
the level of suppletion in form-meaning mappings.</p>
      <p>Moreover, standard cases of ‘suppletion’ are
merely extreme instances of discriminative
contrasts that seem ubiquitous at the sub-phonemic
level. In the domain of word formation, Davis
et al. () found suggestive differences in
duration and fundamental frequency between a word
like captain and a morphologically unrelated
onset word such as cap. Of more direct relevance
are studies of inflectional formations. Baayen et al.
() found that a sample of speakers produced
Dutch nouns with a longer mean duration when
they occurred as singulars than when they
occurred as the stem of the corresponding plural. In
a follow-up study, Kemps et al. () tested
speakers’ sensitivity to prosodic differences, and
concluded that “acoustic differences exist between
uninflected and inflected forms and that listeners are
sensitive to them” (Kemps et al. : ). Recent
studies by Plag et al. () find similar contrasts
between phonemically identical affixes in English.</p>
      <p>The role of discriminability</p>
      <p>From a discriminative perspective, it is regularity
that stands in need of explanation. Learning
models offer a solution here as well. Unlike derivational
processes, inflectional processes are traditionally
assumed to be highly productive, defining uniform
paradigms within a given class. Lemma size is thus
not expected to vary, except where forms are
unavailable due to paradigm ‘gaps’ or ‘defectiveness’.
Yet corpus studies suggest that this expectation
is an idealization. Many potentially available
inflected forms are unattested in corpora. As corpora
increase in size, they do not converge on uniformly
populated paradigms. Instead, they reinforce
previously attested forms and classes while
introducing progressively fewer new units. As shown in</p>
      <p>In order for a collection of partial samples to
allow the generation of unattested forms, the forms
that speakers do know must be organized into
systematic structures that collectively enable the scope
of possible variations to be realized. These
structures correspond to lexical neighbourhoods, whose
effects have been investigated in a wide range of
psycholinguistic studies (Baayen et al. ; Gahl
et al. ). From the present perspective,
neighbourhoods are not independent dimensions of
lexical organization but, rather, constitute the
creative engine of the morphological system,
permitting the extrapolation of the full system from
partial patterns. Interesting support for this
perspective comes from the study reported in Milin et al.
(). In this study, analogical extrapolation from
a small set of nearest neighbors allowed a system to
model the choice of masculine instrumental
singular allomorph by Serbian speakers presented with
nonce words. Regular paradigms thus enable
language users to generate previously unencountered
forms, not because they are the product of an
explicit rule, or of any kind of explicit grammatical
knowledge, but rather because they are implicit in the
distribution of forms and semantics in the language as
a system, much as suggested by Hockett (: ).</p>
      <disp-quote>
        <p>in his analogizing … [t]he native user
of the language … operates in terms of
all sorts of internally stored paradigms,
many of them doubtless only partial</p>
      </disp-quote>
      <p>Gahl, S., Yao, Y. &amp; Johnson, K. (). Why
reduce? Phonological neighborhood density and
phonetic reduction in spontaneous speech.
Journal of Memory and Language (), –.</p>
      <p>Henderson, J. E. (). Phonology and Grammar
of Yele, Papua New Guinea. Pacific Linguistics
B, Canberra: Pacific Linguistics.</p>
      <p>Hockett, C. F. (). The Yawelmani basic verb.
Language , –.</p>
      <p>Kemps, R. J. J. K., Ernestus, M., Schreuder, R.
&amp; Baayen, R. H. (). Prosodic cues for
morphological complexity: The case of Dutch plural
nouns. Memory &amp; Cognition (), –.</p>
      <p>Milin, P., Keuleers, E. &amp; Filipović Đurdjević,
D. (). Allomorphic responses in Serbian
pseudo-nouns as a result of analogical learning.
Acta Linguistica Hungarica , –.</p>
      <p>Plag, I., Homann, J. &amp; Kunter, G. ().
Homophony and morphology: The acoustics of
word-final S in English. Ms,
Heinrich-Heine-Universität, Düsseldorf.</p>
      <p>Ramscar, M., Dye, M. &amp; McCauley, S. M. ().
Error and expectation in language learning: The
curious absence of mouses in adult speech.
Language (), –.</p>
      <p>[Figure: axes labelled “Number of noun infl. variants” and “Number of forms”; one series per sample size (1M, 3M, 6M, 9M, 12M, 15M).]</p>
      <p>Sample sizes (and number of hapax legomena): 1M (1107), 3M (2305), 6M (3187), 9M (8035), 12M (8633), 15M (7365).</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>