=Paper= {{Paper |id=Vol-1419/paper0016 |storemode=property |title=Simulating Intervention to Support Compensatory Strategies in an Artificial Neural Network Model of Atypical Language Development |pdfUrl=https://ceur-ws.org/Vol-1419/paper0016.pdf |volume=Vol-1419 |dblpUrl=https://dblp.org/rec/conf/eapcogsci/YangT15 }} ==Simulating Intervention to Support Compensatory Strategies in an Artificial Neural Network Model of Atypical Language Development== https://ceur-ws.org/Vol-1419/paper0016.pdf

Simulating intervention to support compensatory strategies in an artificial neural
network model of atypical language development
Juan Yang (jkxy_yjuan@sicnu.edu.cn)
Department of Computer Science, Sichuan Normal University
Chengdu, 610101 China
Michael S. C. Thomas (m.thomas@bbk.ac.uk)
Developmental Neurocognition Lab, Department of Psychological Sciences
Birkbeck, University of London, UK

Abstract

Artificial neural networks have been used to model developmental deficits in cognitive and language development, most often by including sub-optimal input-output representations or computational parameters in these learning systems. The next step is to simulate intervention to alleviate developmental impairments, to inform the mechanistic basis of remediation. Here we used a sample model of atypical language development (in the well-explored domain of past tense acquisition) to investigate the extent to which alternative training regimes may induce short-term or long-term compensatory changes in underlying function, and the extent to which this depends on the timing of intervention. We present a new method to derive ‘intervention’ training sets as a simulation of behavioral interventions, and assess its adequacy in our sample model.

Keywords: language development; developmental disorders; artificial neural network models; intervention; compensation

Introduction

Computational models of development, particularly those employing artificial neural networks (ANN), have provided hypotheses about the mechanistic bases of cognitive and language deficits (Mareschal & Thomas, 2007). For example, in the domain of language, Harm, McCandliss and Seidenberg (2003) demonstrated how limited connectivity in the phonology component of a reading model produced a system with symptoms of dyslexia. In a model of inflectional morphology, Thomas (2005) demonstrated how shallow sigmoid activation functions yielded processing units that were insensitive to small changes in the input, producing networks that exhibited developmental delay. More recently, Thomas and Knowland (2014) considered how multiple changes to intrinsic computational properties and extrinsic environmental factors could produce different types of language delay that either persisted or resolved over developmental time.

Progress of this type motivated Daniloff (2002, p.viii), in his book Connectionist approaches to clinical problems in speech and language, to comment ‘ANN theory will … form the backbone of much of language therapy in the near future’. However, research and practice have yet to repay this optimism (though see Poll, 2011, for renewed attempts to make these links). Only one computational study has systematically explored the efficacy of a single intervention to address a developmental deficit (in Harm et al.’s 2003 reading model; Harm, McCandliss & Seidenberg, 2003).

Slightly more work has used ANN models to investigate remediation following acquired damage. For example, in a model of acquired dyslexia, Plaut (1996) considered the degree and speed of recovery through retraining, the extent to which improvement on treated items generalizes to untreated items, and how treated items are selected to maximize this generalization. Abel et al. (2007) demonstrated how an adult model of aphasia could guide actual interventions depending on patients’ error patterns. Even here, however, the work remains limited.

The computational approach to development has generated a growing understanding of environmental factors that influence learning in typical development (Borovsky & Elman, 2006; Gomez, 2005; Onnis et al., 2005). This includes the importance of factors such as the frequency of training items, their similarity and variability, and the provision of novelty in familiar contexts. However, there has yet to be a consideration of how these factors interact with learning systems containing the sorts of atypical computational constraints that lead to impoverished internal representations and, in turn, behavioral deficits compared to typically developing children. It is yet a further step to link such an understanding with the diverse activities that tend to be used by clinicians in speech and language therapy, including such activities as modeling, forced alternatives, repetition, visual approaches to support oral language, and reducing distractions (Ebbels, 2014; Law et al., 2007).

From the perspective of individual network models simulating development, where development is conceived of as the acquisition of the domain instantiated in the learning environment, it is not obvious that ‘behavioral interventions’ could alleviate a developmental impairment that arises either through inadequate representations or insufficient computational power. Here, we conceive of a behavioral intervention as representing the addition of some further training items to the normal training set of the network model. If the model is unable to learn the training set to a given performance level through limitations in processing capacity, adding further input-output mappings to the training set is unlikely to enhance performance on the patterns comprising the original training set. What one might call normalization through behavioral intervention is

therefore difficult if one conceives of developmental deficits as arising from limitations in individual systems. We define normalization as the acquisition of the abilities and knowledge that any typically developing system acquires through exposure to the normal training set.

There are at least three possible responses to this difficulty in achieving normalization. First, the computational properties of the system might be enhanced to enable it to acquire the training set (for instance, for the actual child, by interventions targeting motivation, or by pharmacological means; for the network, by altering one or more of its parameters).

Second, the intervention might target the input and output representations of the system, thereby simplifying the computational problem that the network has to learn. Harm, McCandliss and Seidenberg’s (2003) simulated phonological intervention for developmental dyslexia utilized this method. Best et al. (2015) have recently used a similar approach to simulate behavioral interventions for developmental deficits in productive vocabulary, alternatively targeting phonological or semantic representations that represent the two codes that must be associated in vocabulary development.

Third, one might take the view that what the atypical system needs to learn is not the training set per se (even though this is what typical systems acquire), but a general function implicit in the items comprising the training set. Acquisition of this general function can be assessed by performance on generalization sets rather than the training set. There may then be input-output mappings that can be added to the training set which could improve the network’s ability to learn the general function, even if performance on the original training set did not improve (or even worsened). One might term this approach compensation, since the aim is to optimize a subset of behaviors present in the original training set.

In this paper, we investigate possible ‘behavioral interventions’ to encourage compensation (so defined) in a widely used model drawn from the domain of language development, that of English past tense formation. This model has been used to capture developmental trajectories and error patterns as children acquire English verb morphology, but it has also been used as a sample associative system to consider more general issues in development (see, e.g., use of this model to investigate sensitive periods in development: Marchman, 1993; Thomas & Johnson, 2006; to investigate population-level individual differences: Thomas, Forrester & Ronald, 2015). We introduce a method to derive ‘intervention patterns’ that are added to the training set of atypical networks for a limited period in development, simulating an intervention given to a child diagnosed with a developmental impairment. We compare the effectiveness of two different intervention sets for improving generalization performance on several possible implicit functions that the network might be able to acquire.

Base Model

We employed a simple simulation of past tense formation (Plunkett & Marchman, 1993) as our base model. This model uses an ANN to learn a quasi-regular mapping problem instantiated in an artificial language designed to have many of the properties of English past tense formation. The learning domain is predominantly characterized by a general rule (add –ed to a verb stem to form its past tense). However, there exists a minority of exception or irregular verbs which form their past tense in different ways, for instance with arbitrary associations between stem and past tense form (go-went), no-change irregulars where the past tense form is the same as the verb stem (hit-hit), and vowel-change irregulars (sing-sang, ring-rang). The network is required to learn the association of verb stems to their past tense forms. Generalization can be tested on whether the network can apply the past tense rule to novel verbs, or can apply any of the irregular patterns to novel verbs sharing similarity to irregulars existing in the training set.

Simulating atypical development in the base model

Three-layered ANNs were used to simulate individual children. All the ANNs had 57 input units in the input layer and 62 output units in the output layer, used to represent tri-phoneme verb stems and their past tense forms. The 57 input units corresponded to the binary encoding of a monosyllabic three-phoneme verb, where each phoneme was represented using 19 binary articulatory features. The output included the same binary encoding of the tri-phoneme string with the addition of 5 extra bits to represent the suffix part of the past tense. The encoding is based on one proposed by Plunkett & Marchman (1991; P&M91) (see Thomas & Karmiloff-Smith, 2003, for more details of the artificial language, including the consonant-vowel templates used to generate the artificial verbs).

Figure 1: An ‘atypical’ ANN caused by a capacity deficit of reducing the number of hidden units from 50 to 15.

The training set comprised 508 artificial verbs: 410 regular verbs, 20 identical irregular verbs, 10 arbitrary irregular verbs and 68 vowel-changed irregular verbs. Developmental trajectories were simulated by 1000 presentations of this training set (epochs). A network with 50 hidden units (back propagation algorithm, learning rate 0.1, momentum 0, temperature 1, initial weights randomized between ±1) proved able to learn the training set within

approximately 300 epochs. We implemented a developmental deficit by reducing the computational capacity of the network (Thomas & Knowland, 2014). Piloting indicated that a reduction of hidden unit resources to 15 produced a persisting deficit in learning the training set (architecture shown in Figure 1).

Simulating interventions to encourage compensation in the atypical network

We simulated a behavioral intervention to remediate the developmental impairment in the following way. We assumed that the impairment was diagnosed at some point relatively early in development. At this time, additional patterns were added to the original training set. The intervention set was added to, rather than replaced, the original training set, since we assume that in a clinical setting, interventions take place against the child’s continued experience of his or her normal learning environment. In ANNs, replacement would also incur the risk of catastrophic interference. We assumed that the behavioral intervention was much smaller in scale than continued everyday experience, and so limited the intervention set to 10% of the size of the original training set (50 input-output mappings versus 508 in the original set). Intervention continued for a limited duration (30 epochs of training), at which point the intervention ceased and training reverted to the original set. Intervention had the goal of encouraging acquisition of the regular past tense rule.

We manipulated the timing of intervention, from ‘early’ at 50 epochs in steps of 50 up to ‘late’ at 250 epochs (i.e., 5 stages: 50, 100, 150, 200, and 250) compared to the full training trajectory of 1000 epochs. The importance of early intervention has been stressed within a clinical setting, under the view that plasticity reduces over time. Simple feedforward ANNs have been claimed to capture a reduction in plasticity through entrenchment effects (Marchman, 1993), though a broader set of mechanisms may also produce reductions in plasticity, such as synaptic pruning (Thomas & Johnson, 2006).

We assessed normalization with respect to changes in performance on the original training set. We assessed compensation with respect to changes in performance on five generalization sets. These were:
• A novel rhyme set. Each novel verb shared two out of three phonemes with a verb in the training set. There were 410 regular verb rhymes, 20 no-change irregular verb rhymes, 10 arbitrary irregular verb rhymes, and 76 vowel-change irregular verb rhymes. Finally, there were 56 novel verbs that shared only one phoneme with any verb in the training set. This novel verb set has been used in previous simulations (e.g., Thomas & Karmiloff-Smith, 2003).
• A shadow training set. These were novel artificial verbs regenerated using the same consonant-vowel templates as the original training set (P&M91) and in the same proportions: 410 regular verbs, 20 no-change irregular verbs, 10 arbitrary irregular verbs, and 68 vowel-change irregular verbs.
• A novel set of 508 arbitrary irregular verbs generated using the P&M91 templates.
• A novel set of 508 no-change irregular verbs generated using the P&M91 templates.
• A novel set of 508 vowel-change irregular verbs generated using the P&M91 templates.

For the novel rhyme set, generalization was assessed according to accuracy of producing regularly inflected forms. For irregular verbs in the shadow training set, and for the three novel irregular sets, generalization was tested according to accuracy of producing the target irregular output form.

We asked two questions. Did the intervention to encourage a compensatory strategy produce any benefit at the immediate end of the intervention period? And if so, did any benefit persist after the intervention ceased so that it was observable at the end of training? For the earliest intervention, performance was therefore assessed at 80 and 1000 epochs. For the latest intervention, performance was assessed at 280 and 1000 epochs. In each case, there were 10 replications of networks with different random seeds.

This leaves the challenge of how to construct an intervention set to encourage a compensatory strategy. In the next section, we propose a method.

A method for generating intervention sets for compensation

The problem we needed to solve is how to choose the most effective 50 intervention verbs among the hundreds of thousands of artificial verbs possible within the encoding scheme. The idea is intuitive: if we suppose some of the input units are more important and decisive than others, then the intervention verbs can be chosen based on these features. Now the problem becomes how to identify the key features of the original training set within the original 57-dimensional input space. The extra data set should be able to remedy a disordered ANN in its generalization of the past tense rule. Broadly, the approach we adopted was to take an ANN that had successfully learned the past tense problem. We then varied the activation level of input units and assessed the extent to which this generated error on the output. This should indicate the extent to which input units encoding key dimensions influence learning performance. Formally, we translated this challenge into an optimization problem, specified as:

[Formula (1): an optimization over the difference Y − F_L(·), subject to a summation constraint (∑ … = …) on binary (0/1) variables; the remaining symbols are not recoverable from the source]

In Formula (1), Y is the past tense matrix in the training data set, while F_L(·) is the actual output of the ANN, where L is the number of the final layer of the ANN. So, Formula (1) attempts to select the input units that contribute most to the learning based on the training data set. F_l is a recursive function defined in Formula (2):

\[ F_l = \begin{cases} f(X, W_1), & l = 1 \\ f(F_{l-1}, W_l), & l > 1 \end{cases} \tag{2} \]

Since this is a combinatorial optimization problem, we used a Genetic Algorithm (GA) approach to find the optimized result. In this algorithm, 25 features were selected. However, after the GA filtered out these features, a further selection was necessary, since no artificial verbs could fully satisfy the filtered features. In the final step, a subset of 5 or 6 features was chosen to generate two possible intervention data sets.

Key features selected to generate the intervention data sets

After running the GA, two sets of features were constructed. We refer to the first as the GA feature set, since it was closest to a shortened version of the original filtered 25 features yet consistent with legal verbs within the P&M91 encoding scheme. Novel verbs each contained the 5 selected features shown in Table 1. We refer to the second as the Voice satisfied feature set. Novel verbs each contained the 6 selected features shown in Table 2. These verbs were more consistent with those present in the original training set in terms of their voicing features. One might think of the first intervention set as optimized but somewhat strange, and the second as slightly less optimized but more natural given the ANN’s previous training experiences.

Fifty novel verbs were created for each intervention set. The target output for each novel verb was the regular inflected past tense.

Table 1: GA Featured intervention data set.

Corresponding unit   Feature location   Meaning
1                    first phoneme      sonorant
2                    first phoneme      consonantal
5                    first phoneme      voiced
21                   second phoneme     consonantal
43                   third phoneme      voiced

Table 2: Voice Feature Satisfied data set.

Corresponding unit   Feature location   Meaning
2                    first phoneme      consonantal
5                    first phoneme      voiced
21                   second phoneme     consonantal
24                   second phoneme     voiceless
40                   third phoneme      vowel
43                   third phoneme      voiced

Results

The results of the intervention are shown in Table 3. Beginning with the early intervention condition, no reliable improvement was observed on the original training set at the end of intervention. If anything, intervention caused performance on the training set to worsen. This is in line with the view that normalization is difficult for a network with limited capacity. Compensation was assessed via 5 novel verb sets assessing generalization of different functions that might be extended from the original training set. The first two, novel rhyme and shadow training set, contain significant numbers of regular verbs which one might expect to be aided by the intervention. In both cases, statistically reliable benefits of intervention were observed. Three sets considered the possibility of generalizing irregular patterns. Since there is no systematic relation between arbitrary mappings, one would not expect an intervention effect on novel arbitrary verbs, and none was found. However, both novel no-change and novel vowel-change generalization sets showed benefits. This implies that the intervention had better enabled the atypical network to separate regular and irregular mappings within its representational space, and so generalize both types of general function to novel verbs with features that would support these functions.

We ran a series of omnibus ANOVAs to assess broader patterns. To emphasize the possibility that timing of intervention might have an effect, we focused on a comparison between the earliest intervention point (50 epochs) and latest (250 epochs). Figure 2 demonstrates the effect of intervention at intermediate time points. We first examined training set performance, considering factors of group (treated vs. untreated), intervention type (GA vs. V), and timing (50 vs. 250 epochs), separately for the immediate end of intervention and at the end of training. For performance at the immediate end of intervention, there was a reliable effect of the intervention (F(1,9)=12.96, p=.006), with an effect size of ηp²=.59, corresponding to a worsening of performance. The intervention effect was not modulated by intervention type, nor by timing of intervention. For performance at the end of training, there was no effect of the intervention at 1000 epochs.

Figure 2: Improvements produced by interventions, in terms of reductions in sum-squared error, for interventions occurring at different epochs between 50 and 250. Left panel: effects immediately after the end of intervention. Right panel: effects at the end of training (GA=GA feature intervention set, V=voice satisfied intervention set; TR=training set performance, NR=novel rhyme set, ST=shadow training set, AR=arbitrary, ID=identity, VC=vowel change).

We then considered generalization, with the same design but adding a 5-level factor of generalization set. For performance at the immediate end of intervention, performance was reliably improved by the intervention (F(1,9)=478.49, p<.001), with an effect size of ηp²=.98; the effect differed between intervention sets, with the GA featured set having the larger effect (F(1,9)=31.68, p<.001,

ηp²=.78); improvement depended on the generalization set used (F(1,9)=7.64, p=.022, ηp²=.46); and the intervention effect was not modulated by timing of intervention. The results at the end of training were similar, but with a reduced intervention effect size (ηp²=.83), and now no modulation depending on the type of intervention used.

In sum, in line with our expectations, compensatory strategies were effectively encouraged via the addition of an intervention set. Intervention sets did not achieve normalization and indeed the compensatory strategy (while effective) initially caused performance to further diverge from the typical trajectory. Benefits of intervention were possible across a wide stretch of the developmental trajectory, with little indication of reductions in plasticity across the range of timing of interventions we considered. However, early interventions showed dissipating effects across development once the intervention was discontinued, with the exact type of intervention becoming less relevant.
Table 3: Intervention results. UN=untreated, GA=GA feature intervention set, V=voice satisfied intervention set. Scores
show performance of the network prior to intervention, immediately following an intervention lasting 30 epochs, and at the
end of training of 1000 epochs, for untreated networks and networks treated with each intervention set. Performance is
measured by sum-squared error, where lower numbers represent better performance and higher numbers represent worse
performance. Reliable treatment effects are marked. Interventions were at five time points, 50, 100, 150, 200 and 250 epochs.

Test data sets                          Average RMS errors returned by ANNs intervened at different time points
                                        50th                    100th                   150th                   200th                   250th
                                        50      80      1000    100     130     1000    150     180     1000    200     230     1000    250     280     1000
508 verbs (training data)           UN  1.50    1.24    0.66    1.17    1.07    0.61    0.98    0.93    0.56    0.95    0.86    0.57    0.88    0.90    0.61
                                    GA  1.50    1.38*   0.67    1.17    1.22*   0.61    0.98    1.05*   0.52    0.95    1.04*   0.59    0.88    0.98    0.59
                                    V   1.50    1.46*   0.65    1.17    1.22*   0.60    0.98    1.05*   0.55    0.95    1.01*   0.58    0.88    1.02    0.65
572 novel rhymes                    UN  6.41    6.41    6.65    6.34    6.35    6.64    6.38    6.41    6.63    6.28    6.30    6.53    6.28    6.31    6.56
                                    GA  6.41    6.08*   6.59    6.34    6.14*   6.61    6.38    6.10*   6.60    6.28    6.09*   6.47    6.28    6.09*   6.48
                                    V   6.41    6.04*   6.59    6.34    6.14*   6.57    6.38    6.18*   6.54    6.28    6.10*   6.41    6.28    6.05*   6.44
508 shadow training set             UN  7.34    7.39    8.22    7.40    7.50    8.26    7.56    7.67    8.21    7.68    7.70    8.27    7.81    7.84    8.33
                                    GA  7.34    6.14*   7.93+   7.40    6.25*   8.13+   7.56    6.49*   7.91*   7.68    6.49*   8.00*   7.81    6.49*   8.07*
                                    V   7.34    6.23*   7.97+   7.40    6.38*   8.15    7.56    6.60*   8.00    7.68    6.71*   8.04*   7.81    6.81*   8.13*
508 arbitrary irregular verbs       UN  18.47   18.56   18.77   18.58   18.62   18.81   18.60   18.69   18.88   18.64   18.66   18.86   18.70   18.74   18.88
                                    GA  18.47   18.63   18.84   18.58   18.66   18.83   18.60   18.77   18.91   18.64   18.72   18.85   18.70   18.79   18.96
                                    V   18.47   18.62   18.87   18.58   18.70   18.83   18.60   18.71   18.91   18.64   18.75   18.85   18.70   18.82   18.85
508 identical irregular verbs       UN  5.36    5.38    6.00    5.43    5.45    6.00    5.53    5.68    5.98    5.47    5.52    5.96    5.63    5.69    6.00
                                    GA  5.36    4.57*   5.75    5.43    4.61*   5.77+   5.53    4.83*   5.73    5.47    4.63*   5.72*   5.63    4.72    5.77
                                    V   5.36    4.67*   5.77    5.43    4.69*   5.89    5.53    4.86*   5.85    5.47    4.71*   5.80*   5.63    4.75    5.79
508 vowel changed irregular verbs   UN  8.15    8.24    9.04    8.29    8.35    9.11    8.42    8.47    8.93    8.45    8.48    8.98    8.57    8.52    8.91
                                    GA  8.15    7.62*   8.91    8.29    7.69*   8.86+   8.42    7.71*   8.75    8.45    7.88*   8.83    8.57    7.88*   8.63+
                                    V   8.15    7.74*   8.99    8.29    7.80*   8.95    8.42    7.82*   8.76    8.45    8.07*   8.89    8.57    8.00*   8.74
* Independent t-test treated vs. untreated p<.05
+ Independent t-test treated vs. untreated p<.10
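
The column layout of Table 3 follows from the intervention schedule: 50 intervention verbs are added to the 508-verb training set for 30 epochs starting at one of five onsets, and performance is read off at onset, at the end of the intervention, and at 1000 epochs. The Python sketch below makes that schedule explicit and computes the treatment benefit (untreated error minus treated error) for one row group of the table. All function and variable names are hypothetical, and the convention that the intervention occupies the 30 epochs immediately following the onset epoch is an assumption, since the text does not state the exact boundary.

```python
ORIGINAL_SET_SIZE = 508      # verbs in the original training set
INTERVENTION_SET_SIZE = 50   # intervention verbs added during the window
INTERVENTION_EPOCHS = 30     # duration of the intervention
TOTAL_EPOCHS = 1000          # length of the full training trajectory
ONSETS = (50, 100, 150, 200, 250)


def patterns_in_epoch(epoch: int, onset: int) -> int:
    """Patterns presented in one epoch (assumes the window is onset+1 .. onset+30)."""
    in_window = onset < epoch <= onset + INTERVENTION_EPOCHS
    return ORIGINAL_SET_SIZE + (INTERVENTION_SET_SIZE if in_window else 0)


def assessment_epochs(onset: int) -> tuple:
    """Epochs at which Table 3 reports scores for a given intervention onset."""
    return (onset, onset + INTERVENTION_EPOCHS, TOTAL_EPOCHS)


def benefit(untreated_error: float, treated_error: float) -> float:
    """Reduction in error attributable to treatment; positive means improvement."""
    return round(untreated_error - treated_error, 2)


# Error scores copied from Table 3: shadow training set, intervention at 50 epochs,
# given as (immediately after intervention, end of training).
SHADOW_ONSET_50 = {"UN": (7.39, 8.22), "GA": (6.14, 7.93), "V": (6.23, 7.97)}

for onset in ONSETS:
    print("onset", onset, "assessed at", assessment_epochs(onset))

for group in ("GA", "V"):
    immediate = benefit(SHADOW_ONSET_50["UN"][0], SHADOW_ONSET_50[group][0])
    final = benefit(SHADOW_ONSET_50["UN"][1], SHADOW_ONSET_50[group][1])
    print(group, "benefit immediately:", immediate, "at end of training:", final)
```

For this row group the benefit is larger immediately after the intervention than at the end of training, which is the dissipation pattern described in the Results.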
Discussion

In this work, we have sought to build on successful research using ANNs to simulate atypical cognitive and language development, to consider implications for behavioral interventions to remediate developmental deficits. We focused on the domain of past tense formation, which has been a target of intervention for children with grammatical deficits (Ebbels, 2007; Kulkarni et al., 2014; Seeff-Gabriel et al., 2012). Rather than a realistic model of these interventions, our goal here was more preliminary: to explore methods for deriving possible intervention sets, to assess their impact on different areas of performance, to assess the influence of timing of intervention, and to assess the extent to which any gains were sustained following the cessation of intervention. However, we followed one of the broad tenets of an intervention called grammar facilitation, one of the most widely investigated methods for intervening to address grammar deficits in school age children. In grammar facilitation, the aim is to make target forms more frequent, which is hypothesized to help the child identify grammatical rules and give them practice at producing forms they tend to omit (Ebbels, 2014). In line with this view, our intervention added information to the training set of an ANN model for a fixed period, to increase the salience of certain regularities in the problem domain.

Our results demonstrated that, where a language deficit arises due to limitations in processing capacity, compensation (optimization on a subset of the problem domain) is more readily achievable than normalization (improvement on the whole problem domain), and the particular training items chosen to effect the compensation can alter the size of the effect. Within the intervention window we considered, we found no reductions in receptiveness of the ANN to remediation, indicating no entrenchment or reductions in plasticity. However, benefits did dissipate once the intervention had ceased.

Returning to the target phenomenon, in reality, behavioral interventions to remediate developmental disorders of language and cognition are multi-faceted. They are usually interactional and social, and involve emotional and motivational factors in the child, as well as cognitive factors. There are myriad causes of variability in children’s abilities, be they biological, psychological, environmental, or social – factors that must be considered in planning preventions or interventions (Beauchaine, Neuhaus, Brenner & Gatzke-Kopp, 2008). Clinical practice is driven by a range of principles including the emerging evidence base and the therapeutic setting, as well as the child and family’s goals. Within approaches targeting speech and language needs

directly, the clinician may form a hypothesis as to (i) the nature of the difficulty and (ii) what will be optimally effective for a child. The results of intervention will further refine these hypotheses.

Nevertheless, the quality of neurocomputational mechanisms of learning and development is a key constraining factor, given that these mechanisms underlie behavior, and given that their plasticity is crucial in achieving remediation. We believe there is value in computational modeling work to further understand the mechanistic basis of atypical development and how deficits might be remediated by behavioral means.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (61402309) and UK ESRC grant RES-062-23-2721.

References

Abel, S., Willmes, K. & Huber, W. (2007). Model-oriented naming therapy: Testing predictions of a connectionist model. Aphasiology, 21(5), 411-447.
Beauchaine, T. P., Neuhaus, E., Brenner, S. L., & Gatzke-Kopp, L. (2008). Ten good reasons to consider biological processes in prevention and intervention research. Development and Psychopathology, 20, 745-774.
Best, W., Fedor, A., Hughes, L., Kapikian, A., Masterson, J., Roncoli, S., Fern-Pollak, L., & Thomas, M. S. C. (2015). Intervening to alleviate word-finding difficulties in children: Case series data and a computational modelling foundation. Cognitive Neuropsychology. Article first published online: 25 FEB 2015, doi: 10.1080/02643294.2014.1003204
Borovsky, A. & Elman, J. L. (2006). Language input and semantic categories: a relation between cognition and early word learning. Journal of Child Language, 33(4), 759-790.
Daniloff, R. G. (2002). Connectionist approaches to clinical problems in speech and language. Erlbaum: Mahwah, NJ.
Ebbels, S. (2007). Teaching grammar to school-aged children with specific language impairment using Shape Coding. Child Language Teaching & Therapy, 23, 67-93.
Ebbels, S. (2014). Effectiveness of intervention for grammar in school-aged children with primary language deficits. Child Language Teaching & Therapy, 30(1), 7-40.
Fedor, A., Best, W., Masterson, J., & Thomas, M. S. C. (2013). Towards identifying principles for clinical intervention in developmental language disorders from a neurocomputational perspective. DNLTechreport2013-1 (www.psyc.bbk.ac.uk/research/DNL)
Gomez, R. L. (2005). Dynamically guided learning. In M. Johnson & Y. Munakata (Eds.), Attention and Performance XXI (pp. 87-110). Oxford: OUP.
Harm, M. W., McCandliss, B. D. & Seidenberg, M. S. (2003). Modeling the successes and failures of interventions for disabled readers. Scientific Studies of Reading, 7, 155-182.
Kulkarni, A., Pring, T., & Ebbels, S. (2014). Evaluating the effectiveness of therapy based around Shape Coding to develop the use of regular past tense morphemes in two children with language impairments. Child Language Teaching & Therapy, 30(3), 245-254.
Law, J., Campbell, C., Roulstone, S., Adams, C. & Boyle, J. (2007). Mapping practice onto theory: The speech and language practitioner’s construction of receptive language impairment. International Journal of Language and Communication Disorders, 43, 245-63.
Marchman, V. A. (1993). Constraints on plasticity in a connectionist model of the English past tense. Journal of Cognitive Neuroscience, 5, 215-234.
Mareschal, D. & Thomas, M. S. C. (2007). Computational modeling in developmental psychology. IEEE Transactions on Evolutionary Computation (Special Issue on Autonomous Mental Development), 11, 137-150.
Onnis, L., Monaghan, P., Christiansen, M., & Chater, N. (2005). Variability is the spice of learning, and a crucial ingredient for detecting and generalizing in nonadjacent dependencies. In K. Forbus, D. Gentner & T. Regier (Eds.), Proceedings of the 26th Annual Conference of the Cognitive Science Society (pp. 1047-1052). Mahwah, NJ: Erlbaum.
Plaut, D. C. (1996). Relearning after damage in connectionist networks: Toward a theory of rehabilitation. Brain and Language, 52, 25-82.
Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition. Cognition, 38, 43-102.
Poll, G. H. (2011). Increasing the odds: Applying emergentist theory in language intervention. Language, Speech, and Hearing Services in Schools, 42, 580-591.
Seeff-Gabriel, B., Chiat, S., & Pring, T. (2012). Intervention for co-occurring speech and language difficulties. Child Language Teaching & Therapy, 20, 123-35.
Thomas, M. S. C. (2005). Characterising compensation. Cortex, 41(3), 434-442.
Thomas, M. S. C., Forrester, N. A., & Ronald, A. (2015). Multiscale modeling of gene–behavior associations in an artificial neural network model of cognitive development. Cognitive Science. Article first published online: 3 APR 2015, doi: 10.1111/cogs.12230
Thomas, M. S. C. & Johnson, M. H. (2006). The computational modelling of sensitive periods. Developmental Psychobiology, 48(4), 337-344.
Thomas, M. S. C. & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical phenotypes. Psychological Review, 110, 647-682.
Thomas, M. S. C. & Knowland, V. C. P. (2014). Modelling mechanisms of persisting and resolving delay in language development. Journal of Speech, Language, and Hearing Research, 57(2), 467-483.
