=Paper= {{Paper |id=Vol-1348/maics2013_paper_16 |storemode=property |title=Metathesis and the Genetic Algorithm: Language as a Complex Adaptive System |pdfUrl=https://ceur-ws.org/Vol-1348/maics2013_paper_16.pdf |volume=Vol-1348 |dblpUrl=https://dblp.org/rec/conf/maics/PalmaGOLG13 }} ==Metathesis and the Genetic Algorithm: Language as a Complex Adaptive System== https://ceur-ws.org/Vol-1348/maics2013_paper_16.pdf
   Metathesis in English and Hebrew: A Computational Account of Usage-Based Phonology

Paul De Palma
depalma@gonzaga.edu
Department of Computer Science, Gonzaga University
Spokane, WA 99258-0026

Sara Ganzerli
ganzerli@gonzaga.edu
Department of Civil Engineering, Gonzaga University
Spokane, WA 99258-0026

Shannon Overbay
overbay@gonzaga.edu
Department of Mathematics, Gonzaga University
Spokane, WA 99258-0026

George Luger
luger@cs.unm.edu
Department of Computer Science, University of New Mexico
Albuquerque, NM 87131

Kim Glaspey
kglaspey@zagmail.gonzaga.edu
Department of Computer Science, Gonzaga University
Spokane, WA 99258-0026

                            Abstract
  It is now well understood that language use shapes the acoustic delivery of phonological patterns. One common example of this type of language change-under-use is metathesis, which is the reversal of the expected linear ordering of sounds. The gradual transformation of the Spanish word chipotle to chipolte in the United States is an example of metathetic change. The Genetic Algorithm (GA) is an optimization technique loosely based on the idea of natural selection. This paper shows that the GA can provide a computational model of a usage-based account of examples of metathesis. In the process, it argues that computer models can bring precision to linguistic theory. As an example, we create a GA that is able to characterize metathesis in English and then achieves even better results for related expressions in modern Hebrew.

  Keywords: Genetic Algorithm; metathesis; computational phonology; emergent

                 Usage-Based Linguistics
In the first paragraph of her book on usage-based phonology, Joan Bybee says that “language use plays a role in shaping the form and content of sound systems…[It] affects the nature of mental representation and in some cases the actual phonetic shape of words” (Bybee 2001, p. 1). A non-linguist might reasonably reply, “of course, what else besides use and anatomy could shape sound systems?” Professor Bybee could then show us an elegant but deeply counterintuitive body of work, beginning with that of de Saussure in the early 20th century, which argues that language use can be separated from language competence and, crucially, that language competence is where the real action is.
   While granting the richness of the formalist program in language study, those of us coming from other disciplines might be pleased to learn that beginning in the mid-nineteen seventies, and especially with the wide availability of digitized corpora of spoken language and inexpensive computing power, the study of language as it is actually used has been receiving more attention. Several of the ideas of usage-based linguists have particular implications for the study of sound systems. These include the notion that experience with categories of sound affects their representation: the more experience, the easier the access. Closely related is the idea that what we know about categorization generally applies to phonological structures (see Rosch 1978, of course). Further, there is no firm separation of language structures and the rules that are applied to them (data structures and algorithms, in the language of computer science) as in the formalist tradition (Chomsky and Halle 1968; Pinker 1999). Rather, linguistic properties emerge from the complex interplay of particular languages and their use, just as do purely biological systems. In fact, in this view, language emerges from repeatedly applying underlying and general cognitive mechanisms (Bybee 2010). Finally, and more generally, a correct formal characterization of language, individually or collectively, may not be possible, and even if it were, the formalism itself does not constitute an explanation of the phenomenon under investigation. Rather, as Bybee and McClelland (2005) argue, formalisms describe linguistic regularities that result from the normal process of language use and adaptation.

                 Hume’s Model of Metathesis
Elizabeth Hume’s (2004) study of metathesis is an especially nice example of the application of usage-based techniques to a phenomenon that has puzzled linguists for many years. (All examples of metathesis in this paper are taken from Hume.) Hume defines metathesis as “the process whereby in certain languages the expected linear ordering of sounds is reversed under certain conditions.
Thus, in a string of sounds where we would expect the ordering to be …xy…, we find instead …yx…” (p. 203). For example, in recent American usage, the word chipotle can frequently be heard as chipolte, where /t/ and /l/ exchange positions. A very similar kind of metathesis occurs in binyan 5 of perfective verbs in modern Hebrew. When the /-t-/ indicating the binyan 5 morpheme is followed by a stem-initial strident (/s/ or /z/, for example), the morpheme and the strident exchange their expected positions. Thus we have hitnakem (“he took revenge”) and hidbalet (“he became prominent”) but, also, histader (“he got organized”) and hizdaken (“he grew old”).
   Perhaps the most perplexing element is that a pattern of sounds occurring in one order in language A can occur in the opposite order in language B. Consider examples drawn from Hungarian and Pawnee. In certain Hungarian forms, glottals that precede approximants surface as approximants preceding glottals (/h/ + /r/, in this case, becomes /r/ + /h/). Thus the dative tehernek (“load”) becomes in the plural terhek. In Pawnee, just the opposite occurs. The expected ordering /ti-ir-hissask-kus/ becomes tihrisasku, with the glottal appearing before the approximant. According to Hume, this led metathesis to be analyzed as a phenomenon that is irregular, found in child language, the result of performance errors, or simply the result of language change.
   In fact, implicit in her discussion, though distinctly underplayed, is that metathesis leads to permanent language change. That is, metathesis is a diachronic phenomenon. This raises metathetic change from a mere curiosity whose regularities can be described to an element of language change. And, as Joan Bybee, a leading figure in the usage-based camp, reminds us in her recent book, “nothing in linguistics makes any sense except in light of language change” (Bybee 2010, p. 10). Although the pronunciation of /chipotle/ as /chipolte/, not simply within a linguistic generation but within a single speaker, can be accounted for by her model, Hume’s work becomes really interesting when it tries to account for what was once a puzzling aspect of linguistic change. How, for instance, did the expected /hitsader/ in Modern Hebrew become /histader/? Though diachronic processes are not her primary interest, Hume’s account of metathesis can be reframed in evolutionary terms. What any naturally selective process needs is an initial state, an environment that favors certain forms over others, and an output. Hume’s work provides all three. The initial state, of course, is “the expected linear ordering of sounds.” The output is the reverse ordering. The “certain conditions” correspond to the phonological environment that favors some forms over others.
   Hume argues that metathesis requires two conditions:
     • An indeterminate speech signal
     • An output that conforms to existing patterns in the language.
This is another way of saying that if I don’t quite understand what you just said, I’ll interpret it in light of what I already know. My reinterpretation, of course, will be in the context of what I know best, namely the most frequent sounds in my lexicon. In evolutionary terms, an indeterminate speech signal is one that is not optimally suited to its environment, the “existing patterns of the language.” It is important here to clarify a common misconception about natural selection. It does not claim that a given organism is optimized, that it manifests the best possible arrangement of parts. The theory does claim that differential reproduction allows an organism which is better adapted to a specific and limited environment to produce more offspring than one that is not. So, biology is neither random nor goal-directed. Hume makes a similar point about metathesis: “the goal of metathesis is not to improve the overall psychoacoustic (i.e., universal) cues of a sequence, but rather conforming to the patterns of usage of a given language is key” (p. 225). These two ideas, that frequency of use plays a role in language development (see especially Bybee 2010) and that metathesis can be reframed as an emergent phenomenon, are the ideas that interest us most and that put Hume’s account squarely within the usage-based camp.

                 Emergentist Models of Language
The view that language is emergent, that it is, in fact, a complex adaptive system, has received attention in recent years. One of the earliest accounts is Lindblom’s 1984 attempt to select “with the aid of a self-organizing model a ‘phonological structure’” [emphasis in the original]. In fact, a snippet from that article, “DERIVE LANGUAGE FROM NONLANGUAGE!,” has recently been used as a summary of the goals of usage-based linguistics (Diessel 2011). More recently, Ke and Holland (2006) note that there are two main approaches to the investigation of language origins. First, there are nativist accounts of language competence and performance that concentrate on cognitive mechanisms and their biological underpinnings. Then there are empirical accounts that concentrate on social structures and patterns of linguistic transmission. In the latter, “language could have evolved from simple communication systems through generations of learning and cultural transmission, without new biological mutations specific to language. While the human species may have evolved to be capable of learning and using language, it is more important to recognize that language itself has evolved to be learnable for humans” (Ke and Holland 2006, p. 693).
   Andrew Wedel (2005) offers a nice analogy. It seems unreasonable to assert that one’s ability to hold a fork is genetically encoded in any precise fashion, despite the fact that humans, as far as is known, are the only species to use them. On the other hand, the manner of fork-holding is culturally transmitted within genetically-encoded parameters, namely four fingers and an opposable thumb. We might even become better fork-holders over time, as our forks evolve to fit our gifts. This notion, that linguistic transmission occurs within species-specific parameters, is captured in the emergentist paradigm. As Ellis put it (cited in Ke and Holland 2006, p. 694), language acquisition can be explained by “simple learning mechanisms, operating in and across the human systems for perception, motor-action,
and cognition as they are exposed to language data as part of that communicatively-rich human social environment by an organism eager to exploit the functionality of language” (Ellis 1998, p. 657).
   Both Ke and Holland (2006) and Holland (2005) situate their work within the tradition of agent-based and complex adaptive systems. Holland, the original developer of the genetic algorithm (Holland 1975), describes his own efforts to model language acquisition as a complex adaptive system. He uses the phrase “adaptive agent” to describe an individual collection of linguistic rules that communicates with what appears to be a linguistic environment. Some of these agents have a better fit with the environment than others. These survive to evolve still better rules.
   Though these accounts are persuasive enough, the real question to be addressed is what one gets after one creates a software model of a larger system. O’Reilly and Munakata (2000) make an especially persuasive argument for why one might want to model cognitive processes, the most important piece of which for our own work is that models force investigators to be explicit about their theories. It is one thing to describe a process. It is quite another to describe it with the precision necessary to run it on a computer. Thus Hume draws on Ohala’s (1993) observation that certain categories of sound, glottals and liquids for example (i.e., the closure of the glottis in bitten and /r/; see Ladefoged 2006), have “stretched out features” that can bleed over into adjacent sounds, causing indeterminacy (Hume 2004, p. 219). To construct a computer model, we would have to know how stretched out. Glottals have cues that are certainly longer than the release bursts of stops (/b/, for example). But how much longer? An empirical approach suggests itself immediately: conduct experiments. Another approach, the one implicit in emergentist theory, is to build a model and adjust its parameters until its inputs and outputs conform to the data. In a nutshell, this is what guides our efforts.

                 The Genetic Algorithm
The Genetic Algorithm (GA) is an optimization method based loosely on the idea of natural selection. Individual members of a species who are better adapted to a given environment reproduce more successfully and so pass their adaptations on to their offspring. Over time, individuals possessing the adaptation form interbreeding populations, that is, a new species. In keeping with the biological metaphor, a candidate solution in a GA is known as a chromosome. The chromosome is composed of multiple genes. A collection of chromosomes is called a population. The GA randomly generates an initial population of chromosomes, which are then ranked according to a fitness function. One of the truly marvelous things about the GA is its wide applicability. We have used it to optimize structural engineering components and are currently applying it to a classic problem in graph theory (Ganzerli, S., De Palma, P. et al., 2003, 2005, 2008). As it happens, both problems are NP-Complete, in effect, computationally intractable (Overbay, S., Ganzerli, S., De Palma, P., 2006). In practice, of course, this means that those who attempt to solve these problems must be content with good-enough solutions. Though good-enough may not appeal to purists, it is exactly the kind of solution implicit in natural selection: a local adaptation to local constraints, where the structures undergoing change are themselves the product of a recursive sequence of adaptations. This can be expressed quite compactly:

  GA()
    Initialize(population);        //build initial population
    ComputeCost(population);       //apply cost function
    Sort(population);              //rank population
    while (population has not converged on a good-enough solution)
      Pair(population);            //decide which members reproduce
      Mate(population);            //exchange characteristics
      Mutate(population);          //randomly perturb genes
      ComputeCost(population);     //re-apply cost function to the new members
      Sort(population);            //rank population
      TestConvergence(population); //has a new species appeared?

   The use of the GA to model metathetic change is consistent with Croft’s (2000) theory of language change, which he calls “utterance selection.” In utterance selection, “normal replication is in essence conformity to convention in language use. Altered replication results from the violation of convention in language use. And selection is essentially the gradual establishment of a convention through language use” (p. 7). In Croft’s view, the utterance corresponds to DNA, the replicators to genes, the variants in linguistic structures to alleles. The task in building a model is to find, according to Croft, those mechanisms that cause certain linguistic structures to be favored over others. These are “the causal mechanisms of selection of linguistic structures” (p. 31). Hume’s work provides just such a causal mechanism. We show next that this causal mechanism can be modeled with a GA.

                 Metathesis and the GA
   Hume describes several kinds of metathesis, all conforming, in one way or another, to her initial claim that metathesis results from indeterminate speech signals processed in terms of frequently occurring sequences of sounds in a given language. The chipotle/chipolte example is an instance of this recurring pattern: “a consonant with potentially weak phonetic cues often emerges in a context in which the cues are more robust than they would have been in the expected, yet non-occurring, order” (p. 209). More specifically, stop consonants are easier to perceive in prevocalic position. In fact, over one-third of the metathesis tokens that Hume identifies involve a stop consonant. In the example, [tle] is less favorable in the environment of American English than is [lte]. That is, the stop consonant before the /l/ produces an indeterminate signal for American English speakers, who proceed to shift it to the more frequent prevocalic position.
   How to represent this process in a GA is the next question. Clearly, we must assign a better fitness, a lower cost, to sequences with prevocalic stop consonants than to those with postvocalic stop consonants. But, somehow,
both signal indeterminacy and token frequency must be made part of this process. Here is our approach:

   1. Input an initial population of the base word and the target word. chipotle is an example of a base word and chipolte is an example of the target word. Our GA works with a total population of 64 words. The relative frequency of the base and target words is a parameter. Thus, we might have one instance of the base and four of the target in the initial population.
   2. Generate random sequences of characters that fill out the population. So, if we seeded the population with one instance of the base and four of the target, our GA would randomly generate 59 character sequences.
   3. Assign a fitness value to each of the sequences that make up the population.
   4. Sort, pair, mate, and mutate the population. Sorting is the process of ranking by fitness value. Pairing is the process whereby strings of sounds are collected in two-tuples. As a proof of concept, we adopt a simple approach. The two lowest-cost strings are paired, followed by the next two lowest-cost strings, until we have 16 breeding pairs. The remaining 32 strings are discarded to make room for the progeny of our breeding pairs. Mating is the process by which the paired words pass on their genetic composition (their sounds) in the process of generating two new strings of sounds. Mutating is the random shifting of a fixed fraction of the genes in the population. This mimics the action of chemical/biological/radiological mutagens on individuals. For our purposes, it prevents the system from getting stuck in local minima (see Haupt & Haupt 1998).
   5. Stop when some predetermined condition is met, else go to step 3.

   The cost function in any GA embodies most of the theory being modeled. The other pieces are parameters to the system. The most important of these for us is the relative frequency of the base word and the target, i.e., the initial character sequence and the target of metathetic change, respectively. The cost function itself is an attempt to operationalize Hume’s model. Except for a few items designed to exclude randomly generated but non-occurring phonetic sequences, it is as follows:

   1. A prevocalic stop is more salient than a postvocalic stop. Give a fitness boost to words with prevocalic stops.
   2. By observation 1, penalize words with postvocalic stops.
   3. Glottals, liquids, and glides (/w/, for example) tend to bleed over into adjacent sounds. This is especially true when they follow a stop. Penalize words with glottals, liquids, and glides that follow a stop.
   4. A stop followed by a consonant is perceptually weak. Penalize words with stops followed by consonants.
   5. A stop followed by a strident is perceptually weak and infrequent. Penalize words with prestrident stops. This rule is what allows our GA to generate the kind of metathetic change found in binyan 5 of perfective verbs in Modern Hebrew (/hitsader/ → /histader/) as well as another instance of English metathesis (/ask/ → /aks/).

                 Method and Results
Our GA was written in Java and run under Ubuntu Linux. Its cost function is designed to model, among many other instances, both the chipotle/chipolte metathesis and binyan 5 of perfective verbs in modern Hebrew, specifically hitsader/histader. Every parameter was held constant except the relative frequency of base and target sounds. Since the sounds being modeled occur in the interior of the word in both cases, the strings potle/polte and itsa/ista functioned as surrogates for the entire words. The population size was set at 64 and the mutation factor at 0.5%. For each of 1, 2, and 4 initial chipotle/hitsader tokens, the number of chipolte/histader tokens began at parity and then was doubled three times. So, for instance, if we were working with an initial population of 4 chipotle tokens, we would produce results for 4, 8, 16, and 32 chipolte tokens. Therefore, there were 12 frequency configurations, four for each set of 1, 2, or 4 chipotle tokens. For each of these 12 configurations, we ran the GA 250 times, each run consisting of 250 generations. Along the way, the chipotle/hitsader tokens disappeared. The data is summarized in Tables 1 and 2 below.

                 Discussion and Future Research
   The data illustrates that we were able to design a computational model using the Genetic Algorithm that captures Hume’s model of metathetic change. In every one of the 12 frequency configurations, the chipotle tokens disappeared from the population within three generations and hitsader tokens within two. “Generation,” of course, is the term used in the GA literature. It is not to be confused with a human generation. Further, within 60 generations, on average, chipolte tokens made up an average of 95% of the population. Hebrew metathesis performed even better, with histader tokens comprising an average of 97.3% of the population within, on average, 48 generations. At this point, it might be useful to recall Hume’s two conditions for metathesis: the speech signal must be indeterminate, and the output must conform to existing patterns in the language. As we indicated with the Hungarian and Pawnee attestations above, metathesis is not just a rule-based phenomenon found in the same form cross-linguistically. Rather, it is intimately tied to existing sound patterns within a language. Said another way, metathesis is a usage-based phenomenon. Our model demonstrates this in terms of a very solid frequency effect. The maximum number of target tokens tends to stabilize more quickly and at a higher percent of the total population as the number of target tokens in the initial population increases. Further, the larger the set, where a set
is defined as the number of base tokens in the initial population, the better the performance. This is illustrated most strongly when we look at data from the first and last element of each configuration; that is, when we compare 1:1, 2:2, and 4:4 with 1:8, 2:16, and 4:32. The more frequent the target within the initial population, the more quickly the population stabilizes on the target and at a higher percent of the total population.
   Nevertheless, Hume’s model is underspecified from an algorithmic/computational standpoint. Though it specifies very clearly what kinds of sounds are potentially vulnerable to metathetic change and in what context, the computational modeler must guess how to weight the various phonetic factors involved and, in particular, must guess at frequency thresholds. We regard our study as a proof of concept. In future work we will build our frequency hypotheses into the rules themselves. For example, instead of simply rewarding strings with a prevocalic stop and penalizing those with a postvocalic stop, we will use transcribed corpora to estimate the frequency of both vulnerable cues and the targets of metathetic change. These frequencies will be used to weight the penalties and rewards, thus making as precise as possible important observations like, “Indeterminacy sets the stage for metathesis, and the knowledge of the sound patterns of one’s language influences how the signal is processed and, thus, the order in which the sounds are parsed” (Hume 2004, pp. 209-210). Our goal is that by gathering data on vulnerable sounds in corpora of actual speech, we will be able to generate all of the instances of metathesis within a language. This will add weight to Hume’s observations and perhaps be useful in accounting for and predicting other types of language change.

        Table 1: Chipotle, 250 Runs, 250 Generations Each

  Ratio of Base   Generation Chipotle   Generation Chipolte   Percent of Chipolte
  to Target       Disappeared           Stabilized            Tokens at Stabilization
  1:1             3                     119                   73.4
  1:2             3                     68                    93.7
  1:4             3                     60                    96.8
  1:8             2                     44                    98.4
  2:2             3                     90                    92.1
  2:4             3                     72                    96.8
  2:8             2                     50                    98.4
  2:16            2                     31                    98.4
  4:4             3                     58                    96.8
  4:8             2                     44                    98.4
  4:16            2                     33                    98.4
  4:32            1                     26                    98.4

        Table 2: Hitsader, 250 Runs, 250 Generations Each

  Ratio of Base   Generation Hitsader   Generation Histader   Percent of Histader
  to Target       Disappeared           Stabilized            Tokens at Stabilization
  1:1             2                     79                    84.3
  1:2             2                     65                    98.4
  1:4             2                     55                    98.4
  1:8             2                     39                    98.4
  2:2             2                     59                    98.4
  2:4             2                     47                    98.4
  2:8             2                     43                    98.4
  2:16            1                     33                    98.4
  4:4             2                     49                    98.4
  4:8             1                     43                    98.4
  4:16            1                     32                    98.4
  4:32            1                     23                    100

                 Acknowledgements
The authors would like to acknowledge the many student research assistants who have contributed their talent and enthusiasm to the Gonzaga University Center for Evolutionary Algorithms for over a decade.

                 References
Bybee, J. 2001. Phonology and Language Use. Cambridge: Cambridge University Press.
Bybee, J. 2010. Language, Usage and Cognition. Cambridge: Cambridge University Press.
Bybee, J., and McClelland, J. 2005. Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition. The Linguistic Review 22, 381-410.
Chomsky, N., and Halle, M. 1968. The Sound Pattern of English. NY: Harper and Row.
Croft, W. 2000. Explaining Language Change: An Evolutionary Approach. Harlow, England: Pearson.
Ellis, N. 1998. Emergentism, Connectionism, and Language Learning. Language Learning 48, 631-664.
Ganzerli, S., and De Palma, P. 2008. Genetic Algorithms and Structural Design Using Convex Models of Uncertainty. In Y. Tsompanakis, N. Lagaros, M. Papadrakakis, eds. Structural Design Optimization Considering Uncertainties. London: A.A. Balkema Publishers, A Member of the Taylor and Francis Group.
Ganzerli, S., De Palma, P., Stackle, P., Brown, A. 2005. Info-Gap uncertainty on structural optimization via genetic algorithms. Proceedings of the Ninth International Conference on Structural Safety and Reliability, Rome.
Ganzerli, S., De Palma, P., Smith, J., & Burkhart, M. 2003. Efficiency of genetic algorithms for optimal structural
                                                                   design considering convex modes of uncertainty.
                                                                   Proceedings of The Ninth International Conference on
                                                                   Applications of Statistics and Probability in Civil
                                                                   Engineering, San Francisco.
                                                                 Haupt, L. & Haupt, S. 1998. Practical Genetic Algorithms.
                                                                   New York: John Wiley and Sons.
Diessel, H. 2011. Review Articles: Language, usage, and
  cognition. Language 87(4):830-844.
Holland, J. 1975. Adaptation in Natural and Artificial
  Systems. Ann Arbor: The University of Michigan Press.
Holland, J. 2005. Language acquisition as a complex
  adaptive system. In Minett, J., Wang, W. eds. Language
  Acquisition, Change and Emergence: Essays in
  Evolutionary Linguistics. Hong Kong: City University of
  Hong Kong Press.
Hume, E. 2004. The Indeterminacy/Attestation Model of
  Metathesis. Language 80(2): 203-237.
Ke, J., & Holland, J. 2006. Language Origin from an
  Emergentist Perspective. Applied Linguistics 27(4):
  691-716.
Ladefoged, P. 2006. A Course in Phonetics. Boston:
  Thomson/Wadsworth.
Lindblom, B., MacNeilage, P., & Studdert-Kennedy, M.
  1984. Self-organizing processes and the explanation of
  phonological universals. In Butterworth, B., Comrie, B.,
  Dahl, O. eds. Explanations for Language Universals. New
  York: Mouton.
O’Reilly, R., & Munakata, Y. 2000. Computational
  Explorations in Cognitive Neuroscience: Understanding
  the Mind by Simulating the Brain. Cambridge: MIT
  Press.
Ohala, J. 1993. Sound change as nature’s speech perception
  experiment. Speech Communication 13: 155-161.
Overbay, S., Ganzerli, S., De Palma, P., Brown, A., Stackle,
  P. 2006. Trusses, NP-Completeness, and Genetic
  Algorithms. Proceedings of the 17th Analysis and
  Computation Specialty Conference. St. Louis.
Pinker, S. 1999. Words and Rules. NY: HarperCollins.
Rosch, E. 1978. Principles of Categorization. In E. Rosch
  and B. Lloyd (eds.), Cognition and Categorization, 27-48.
  Hillsdale, NJ: Lawrence Erlbaum Associates.
Wedel, A. Contrast Maintenance in Language and the
  Innateness Debate. Retrieved 2/18/2013 from
  http://dingo.sbs.arizona.edu/~wedel/research/PDF/wedelcontrastsummary.pdf