=Paper= {{Paper |id=Vol-1419/paper0016 |storemode=property |title=Simulating Intervention to Support Compensatory Strategies in an Artificial Neural Network Model of Atypical Language Development |pdfUrl=https://ceur-ws.org/Vol-1419/paper0016.pdf |volume=Vol-1419 |dblpUrl=https://dblp.org/rec/conf/eapcogsci/YangT15 }} ==Simulating Intervention to Support Compensatory Strategies in an Artificial Neural Network Model of Atypical Language Development== https://ceur-ws.org/Vol-1419/paper0016.pdf
  Simulating intervention to support compensatory strategies in an artificial neural
                  network model of atypical language development
                                             Juan Yang (jkxy_yjuan@sicnu.edu.cn)
                                   Department of Computer Science, Sichuan Normal University
                                                   Chengdu, 610101 China
                                        Michael S. C. Thomas (m.thomas@bbk.ac.uk)
                           Developmental Neurocognition Lab, Department of Psychological Sciences
                                           Birkbeck, University of London, UK



Abstract

Artificial neural networks have been used to model developmental deficits in cognitive and language development, most often by including sub-optimal input-output representations or computational parameters in these learning systems. The next step is to simulate intervention to alleviate developmental impairments, to inform the mechanistic basis of remediation. Here we used a sample model of atypical language development (in the well-explored domain of past tense acquisition) to investigate the extent to which alternative training regimes may induce short-term or long-term compensatory changes in underlying function, and the extent to which this depends on the timing of intervention. We present a new method to derive ‘intervention’ training sets as a simulation of behavioral interventions, and assess its adequacy in our sample model.

Keywords: language development; developmental disorders; artificial neural network models; intervention; compensation

Introduction

Computational models of development, particularly those employing artificial neural networks (ANN), have provided hypotheses about the mechanistic bases of cognitive and language deficits (Mareschal & Thomas, 2007). For example, in the domain of language, Harm, McCandliss and Seidenberg (2003) demonstrated how limited connectivity in the phonology component of a reading model produced a system with symptoms of dyslexia. In a model of inflectional morphology, Thomas (2005) demonstrated how shallow sigmoid activation functions yielded processing units that were insensitive to small changes in the input, producing networks that exhibited developmental delay. More recently, Thomas and Knowland (2014) considered how multiple changes to intrinsic computational properties and extrinsic environmental factors could produce different types of language delay that either persisted or resolved over developmental time.

Progress of this type motivated Daniloff (2002, p.viii), in his book Connectionist approaches to clinical problems in speech and language, to comment ‘ANN theory will … form the backbone of much of language therapy in the near future’. However, research and practice have yet to repay this optimism (though see Poll, 2011, for renewed attempts to make these links). Only one computational study has systematically explored the efficacy of a single intervention to address a developmental deficit (in Harm et al.’s 2003 reading model; Harm, McCandliss & Seidenberg, 2003). Slightly more work has used ANN models to investigate remediation following acquired damage. For example, in a model of acquired dyslexia, Plaut (1996) considered the degree and speed of recovery through retraining, the extent to which improvement on treated items generalizes to untreated items, and how treated items are selected to maximize this generalization. Abel et al. (2007) demonstrated how an adult model of aphasia could guide actual interventions depending on patients’ error patterns. Even here, however, the work remains limited.

The computational approach to development has generated a growing understanding of environmental factors that influence learning in typical development (Borovsky & Elman, 2006; Gomez, 2005; Onnis et al., 2005). This includes the importance of factors such as the frequency of training items, their similarity and variability, and the provision of novelty in familiar contexts. However, there has yet to be a consideration of how these factors interact with learning systems containing the sorts of atypical computational constraints that lead to impoverished internal representations and, in turn, behavioral deficits compared to typically developing children. It is yet a further step to link such an understanding with the diverse activities that tend to be used by clinicians in speech and language therapy, including such activities as modeling, forced alternatives, repetition, visual approaches to support oral language, and reducing distractions (Ebbels, 2014; Law et al., 2007).

From the perspective of individual network models simulating development, where development is conceived of as the acquisition of the domain instantiated in the learning environment, it is not obvious that ‘behavioral interventions’ could alleviate a developmental impairment that arises either through inadequate representations or insufficient computational power. Here, we conceive of a behavioral intervention as representing the addition of some further training items to the normal training set of the network model. If the model is unable to learn the training set to a given performance level through limitations in processing capacity, adding further input-output mappings to the training set is unlikely to enhance performance on the patterns comprising the original training set. What one might call normalization through behavioral intervention is
therefore difficult if one conceives of developmental deficits as arising from limitations in individual systems. We define normalization as the acquisition of the abilities and knowledge that any typically developing system acquires through exposure to the normal training set.

There are at least three possible responses to this difficulty in achieving normalization. First, the computational properties of the system might be enhanced to enable it to acquire the training set (for instance, for the actual child, by interventions targeting motivation, or by pharmacological means; for the network, by altering one or more of its parameters).

Second, the intervention might target the input and output representations of the system, thereby simplifying the computational problem that the network has to learn. Harm, McCandliss and Seidenberg’s (2003) simulated phonological intervention for developmental dyslexia utilized this method. Best et al. (2015) have recently used a similar approach to simulate behavioral interventions for developmental deficits in productive vocabulary, alternatively targeting phonological or semantic representations that represent the two codes that must be associated in vocabulary development.

Third, one might take the view that what the atypical system needs to learn is not the training set per se (even though this is what typical systems acquire), but a general function implicit in the items comprising the training set. Acquisition of this general function can be assessed by performance on generalization sets rather than the training set. There may then be input-output mappings that can be added to the training set which could improve the network’s ability to learn the general function, even if performance on the original training set did not improve (or even worsened). One might term this approach compensation, since the aim is to optimize a subset of behaviors present in the original training set.

In this paper, we investigate possible ‘behavioral interventions’ to encourage compensation (so defined) in a widely used model drawn from the domain of language development, that of English past tense formation. This model has been used to capture developmental trajectories and error patterns as children acquire English verb morphology, but it has also been used as a sample associative system to consider more general issues in development (see, e.g., use of this model to investigate sensitive periods in development: Marchman, 1993; Thomas & Johnson, 2006; to investigate population-level individual differences: Thomas, Forrester & Ronald, 2015). We introduce a method to derive ‘intervention patterns’ that are added to the training set of atypical networks for a limited period in development, simulating an intervention given to a child diagnosed with a developmental impairment. We compare the effectiveness of two different intervention sets for improving generalization performance on several possible implicit functions that the network might be able to acquire.

Base Model

We employed a simple simulation of past tense formation (Plunkett & Marchman, 1993) as our base model. This model uses an ANN to learn a quasi-regular mapping problem instantiated in an artificial language designed to have many of the properties of English past tense formation. The learning domain is predominantly characterized by a general rule (add –ed to a verb stem to form its past tense). However, there exists a minority of exception or irregular verbs which form their past tense in different ways, for instance with arbitrary associations between stem and past tense form (go-went), no-change irregulars where the past tense form is the same as the verb stem (hit-hit), and vowel-change irregulars (sing-sang, ring-rang). The network is required to learn the association of verb stems to their past tense forms. Generalization can be tested on whether the network can apply the past tense rule to novel verbs, or can apply any of the irregular patterns to novel verbs sharing similarity to irregulars existing in the training set.

Simulating atypical development in the base model

Three-layered ANNs were used to simulate individual children. All the ANNs had 57 input units in the input layer and 62 output units in the output layer, used to represent tri-phoneme verb stems and their past tense forms. The 57 input units corresponded to the binary encoding of a monosyllabic three-phoneme verb, where each phoneme was represented using 19 binary articulatory features. The output included the same binary encoding of the tri-phoneme string with the addition of 5 extra bits to represent the suffix part of the past tense. The encoding is based on one proposed by Plunkett & Marchman (1991; P&M91) (see Thomas & Karmiloff-Smith, 2003, for more details of the artificial language, including the consonant-vowel templates used to generate the artificial verbs).

Figure 1: An ‘atypical’ ANN with a capacity deficit, produced by reducing the number of hidden units from 50 to 15.

The training set comprised 508 artificial verbs: 410 regular verbs, 20 identical irregular verbs, 10 arbitrary irregular verbs and 68 vowel-changed irregular verbs. Developmental trajectories were simulated by 1000 presentations of this training set (epochs). A network with 50 hidden units (back propagation algorithm, learning rate 0.1, momentum 0, temperature 1, initial weights randomized between ±1) proved able to learn the training set within
approximately 300 epochs. We implemented a developmental deficit by reducing the computational capacity of the network (Thomas & Knowland, 2014). Piloting indicated that a reduction of hidden unit resources to 15 produced a persisting deficit in learning the training set (architecture shown in Figure 1).

Simulating interventions to encourage compensation in the atypical network

We simulated a behavioral intervention to remediate the developmental impairment in the following way. We assumed that the impairment was diagnosed at some point relatively early in development. At this time, additional patterns were added to the original training set. The intervention set was added to the original training set rather than replacing it, since we assume that in a clinical setting, interventions take place against the child’s continued experience of his or her normal learning environment. In ANNs, replacement would also incur the risk of catastrophic interference. We assumed that the behavioral intervention was much smaller in scale than continued everyday experience, and so limited the intervention set to roughly 10% of the size of the original training set (50 input-output mappings versus 508 in the original set). Intervention continued for a limited duration (30 epochs of training), at which point the intervention ceased and training reverted to the original set. Intervention had the goal of encouraging acquisition of the regular past tense rule.

We manipulated the timing of intervention, from ‘early’ at 50 epochs in steps of 50 up to ‘late’ at 250 epochs (i.e., 5 stages: 50, 100, 150, 200, and 250) compared to the full training trajectory of 1000 epochs. The importance of early intervention has been stressed within a clinical setting, under the view that plasticity reduces over time. Simple feedforward ANNs have been claimed to capture a reduction in plasticity through entrenchment effects (Marchman, 1993), though a broader set of mechanisms may also produce reductions in plasticity, such as synaptic pruning (Thomas & Johnson, 2006).

We assessed normalization with respect to changes in performance on the original training set. We assessed compensation with respect to changes in performance on five generalization sets. These were:
• A novel rhyme set. Each novel verb shared two out of three phonemes with a verb in the training set. There were 410 regular verb rhymes, 20 no-change irregular verb rhymes, 10 arbitrary irregular verb rhymes, and 76 vowel-change irregular verb rhymes. Finally, there were 56 novel verbs that shared only one phoneme with any verb in the training set. This novel verb set has been used in previous simulations (e.g., Thomas & Karmiloff-Smith, 2003).
• A shadow training set. These were novel artificial verbs regenerated using the same consonant-vowel templates as the original training set (P&M91) and in the same proportions: 410 regular verbs, 20 no-change irregular verbs, 10 arbitrary irregular verbs, and 68 vowel-change irregular verbs.
• A novel set of 508 arbitrary irregular verbs generated using the P&M91 templates.
• A novel set of 508 no-change irregular verbs generated using the P&M91 templates.
• A novel set of 508 vowel-change irregular verbs generated using the P&M91 templates.

For the novel rhyme set, generalization was assessed according to accuracy of producing regularly inflected forms. For irregular verbs in the shadow training set, and for the three novel irregular sets, generalization was tested according to accuracy of producing the target irregular output form.

We asked two questions. Did the intervention to encourage a compensatory strategy produce any benefit at the immediate end of the intervention period? And if so, did any benefit persist after the intervention ceased so that it was observable at the end of training? For the earliest intervention, performance was therefore assessed at 80 and 1000 epochs. For the latest intervention, performance was assessed at 280 and 1000 epochs. In each case, there were 10 replications of networks with different random seeds.

This leaves the challenge of how to construct an intervention set to encourage a compensatory strategy. In the next section, we propose a method.

A method for generating intervention sets for compensation

The problem we needed to solve is how to choose the most effective 50 intervention verbs among the hundreds of thousands of artificial verbs possible within the encoding scheme. The idea is intuitive: if we suppose some of the input units are more important and decisive than others, then the intervention verbs can be chosen based on these features. Now the problem becomes how to identify the key features within the original training set among the original 57-dimensional input space. The extra data set should be able to remedy a disordered ANN in its generalization of the past tense rule. Broadly, the approach we adopted was to take an ANN that had successfully learned the past tense problem. We then varied the activation level of input units and assessed the extent to which this generated error on the output. This should indicate the extent to which input units encoding key dimensions influence learning performance. Formally, we translated this challenge into an optimization problem, specified as:

$$\min_{s}\ \bigl\lVert F_n(X \circ s) - Y \bigr\rVert, \qquad \sum_{i} s_i = k, \qquad s_i = 0 \text{ or } 1 \tag{1}$$

In Formula (1), $Y$ is the past tense matrix in the training data set, while $F_n(X \circ s)$ is the actual output of the ANN, and $n$ is the number of the final layer of the ANN; $X$ is the matrix of verb stems in the training data set and $s$ is a binary vector selecting $k$ of the input units. So, Formula (1) attempts to select out the input units that contribute most to the learning based on the training data set. $F_l$ is a recursive function defined in Formula (2):
$$F_l = \begin{cases} f(X \circ s,\ W_1), & l = 1 \\ f(F_{l-1},\ W_l), & l > 1 \end{cases} \tag{2}$$

where $f$ is the transformation computed by one layer and $W_l$ denotes the weights of layer $l$.

Since this is a combinatorial optimization problem, we used a Genetic Algorithm (GA) approach to find the optimized result. In this algorithm, $k$ = 25 features. However, after the GA filtered out these features, a further selection was necessary, since no artificial verbs could fully satisfy the filtered features. In the final step, a subset of 5 or 6 features was chosen to generate two possible intervention data sets.

Key features selected to generate the intervention data sets

After running the GA, two sets of features were constructed. We refer to the first as the GA feature set, since it was closest to a shortened version of the original filtered 25 features yet consistent with legal verbs within the P&M91 encoding scheme. Novel verbs each contained the 5 selected features shown in Table 1. We refer to the second as the Voice satisfied feature set. Novel verbs each contained the 6 selected features shown in Table 2. These verbs were more consistent with those present in the original training set in terms of their voicing features. One might think of the first intervention set as optimized but somewhat strange, and the second as slightly less optimized but more natural given the ANN’s previous training experiences.

Fifty novel verbs were created for each intervention set. The target output for each novel verb was the regular inflected past tense.

Table 1: GA Featured intervention data set.

Corresponding Unit    Feature location       Meaning
1                     first phoneme          sonorant
2                     first phoneme          consonantal
5                     first phoneme          voiced
21                    second phoneme         consonantal
43                    third phoneme          voiced

Table 2: Voice Feature Satisfied data set.

Corresponding Unit    Feature location       Meaning
2                     first phoneme          consonantal
5                     first phoneme          voiced
21                    second phoneme         consonantal
24                    second phoneme         voiceless
40                    third phoneme          vowel
43                    third phoneme          voiced

Results

The results of the intervention are shown in Table 3. Beginning with the early intervention condition, no reliable improvement was observed on the original training set at the end of intervention. If anything, intervention caused performance on the training set to worsen. This is in line with the view that normalization is difficult for a network with limited capacity. Compensation was assessed via 5 novel verb sets assessing generalization of different functions that might be extended from the original training set. The first two, novel rhyme and shadow training set, contain significant numbers of regular verbs which one might expect to be aided by the intervention. In both cases, statistically reliable benefits of intervention were observed. Three sets considered the possibility of generalizing irregular patterns. Since there is no systematic relation between arbitrary mappings, one would not expect an intervention effect on novel arbitrary verbs, and none was found. However, both novel no-change and novel vowel-change generalization sets showed benefits. This implies that the intervention had better enabled the atypical network to separate regular and irregular mappings within its representational space, and so generalize both types of general function to novel verbs with features that would support these functions.

We ran a series of omnibus ANOVAs to assess broader patterns. To emphasize the possibility that timing of intervention might have an effect, we focused on a comparison between the earliest intervention point (50 epochs) and latest (250 epochs). Figure 2 demonstrates the effect of intervention at intermediate time points. We first examined training set performance, considering factors of group (treated vs. untreated), intervention type (GA vs. V), and timing (50 vs. 250 epochs), separately for the immediate end of intervention and at the end of training. For performance at the immediate end of intervention, there was a reliable effect of the intervention (F(1,9)=12.96, p=.006), with an effect size of ηp²=.59, corresponding to a worsening of performance. The intervention effect was not modulated by intervention type, nor by timing of intervention. For performance at the end of training, there was no effect of the intervention at 1000 epochs.

Figure 2: Improvements produced by interventions, in terms of reductions in sum-squared error, for interventions occurring at different epochs between 50 and 250. Left panel: effects immediately after the end of intervention. Right panel: effects at the end of training (GA=GA feature intervention set, V=voice satisfied intervention set; TR=training set performance, NR=novel rhyme set, ST=shadow training set, AR=arbitrary, ID=identity, VC=vowel change).

We then considered generalization, with the same design but adding a 5-level factor of generalization set. For performance at the immediate end of intervention, performance was reliably improved by the intervention (F(1,9)=478.49, p<.001), with an effect size of ηp²=.98; the effect differed between intervention sets, with the GA featured set having the larger effect (F(1,9)=31.68, p<.001,
ηp²=.78); improvement depended on the generalization set used (F(1,9)=7.64, p=.022, ηp²=.46); and the intervention effect was not modulated by timing of intervention. The results at the end of training were similar, but with a reduced intervention effect size (ηp²=.83), and now no modulation depending on the type of intervention used.

In sum, in line with our expectations, compensatory strategies were effectively encouraged via the addition of an intervention set. Intervention sets did not achieve normalization and indeed the compensatory strategy (while effective) initially caused performance to further diverge from the typical trajectory. Benefits of intervention were possible across a wide stretch of the developmental trajectory, with little indication of reductions in plasticity across the range of timing of interventions we considered. However, early interventions showed dissipating effects across development once the intervention was discontinued, with the exact type of intervention becoming less relevant.
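
The effect sizes quoted with the ANOVA results can be cross-checked against the F ratios. The sketch below assumes the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error) applies to these tests. The function name `partial_eta_squared` is illustrative and not part of the original analysis. The F values, degrees of freedom and reported effect sizes are copied from the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (description, F, reported effect size); all tests are reported as F(1,9).
reported = [
    ("training set, effect of intervention", 12.96, 0.59),
    ("generalization, effect of intervention", 478.49, 0.98),
    ("generalization, GA vs. V intervention set", 31.68, 0.78),
    ("generalization, dependence on generalization set", 7.64, 0.46),
]

for label, f_value, eta_reported in reported:
    eta = partial_eta_squared(f_value, df_effect=1, df_error=9)
    print(f"{label}: F(1,9)={f_value:.2f} -> eta_p^2={eta:.2f} (reported {eta_reported:.2f})")
```

Under this assumption, each recomputed value agrees with the reported effect size to two decimal places.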
Table 3: Intervention results. UN=untreated, GA=GA feature intervention set, V=voice satisfied intervention set. Scores
show performance of the network prior to intervention, immediately following an intervention lasting 30 epochs, and at the
end of training of 1000 epochs, for untreated networks and networks treated with each intervention set. Performance is
measured by sum-squared error, where lower numbers represent better performance and higher numbers represent worse
performance. Reliable treatment effects are marked. Interventions were at five time points, 50, 100, 150, 200 and 250 epochs.

Average RMS errors for networks receiving intervention at each onset epoch; column headings give the epoch at which error was measured.

Test data set                           Onset 50                Onset 100               Onset 150               Onset 200               Onset 250
                                        50      80      1000    100     130     1000    150     180     1000    200     230     1000    250     280     1000
508 verbs (training data)           UN  1.50    1.24    0.66    1.17    1.07    0.61    0.98    0.93    0.56    0.95    0.86    0.57    0.88    0.90    0.61
                                    GA  1.50    1.38*   0.67    1.17    1.22*   0.61    0.98    1.05*   0.52    0.95    1.04*   0.59    0.88    0.98    0.59
                                    V   1.50    1.46*   0.65    1.17    1.22*   0.60    0.98    1.05*   0.55    0.95    1.01*   0.58    0.88    1.02    0.65
572 novel rhymes                    UN  6.41    6.41    6.65    6.34    6.35    6.64    6.38    6.41    6.63    6.28    6.30    6.53    6.28    6.31    6.56
                                    GA  6.41    6.08*   6.59    6.34    6.14*   6.61    6.38    6.10*   6.60    6.28    6.09*   6.47    6.28    6.09*   6.48
                                    V   6.41    6.04*   6.59    6.34    6.14*   6.57    6.38    6.18*   6.54    6.28    6.10*   6.41    6.28    6.05*   6.44
508 shadow training set             UN  7.34    7.39    8.22    7.40    7.50    8.26    7.56    7.67    8.21    7.68    7.70    8.27    7.81    7.84    8.33
                                    GA  7.34    6.14*   7.93+   7.40    6.25*   8.13+   7.56    6.49*   7.91*   7.68    6.49*   8.00*   7.81    6.49*   8.07*
                                    V   7.34    6.23*   7.97+   7.40    6.38*   8.15    7.56    6.60*   8.00    7.68    6.71*   8.04*   7.81    6.81*   8.13*
508 arbitrary irregular verbs       UN  18.47   18.56   18.77   18.58   18.62   18.81   18.60   18.69   18.88   18.64   18.66   18.86   18.70   18.74   18.88
                                    GA  18.47   18.63   18.84   18.58   18.66   18.83   18.60   18.77   18.91   18.64   18.72   18.85   18.70   18.79   18.96
                                    V   18.47   18.62   18.87   18.58   18.70   18.83   18.60   18.71   18.91   18.64   18.75   18.85   18.70   18.82   18.85
508 identical irregular verbs       UN  5.36    5.38    6.00    5.43    5.45    6.00    5.53    5.68    5.98    5.47    5.52    5.96    5.63    5.69    6.00
                                    GA  5.36    4.57*   5.75    5.43    4.61*   5.77+   5.53    4.83*   5.73    5.47    4.63*   5.72*   5.63    4.72    5.77
                                    V   5.36    4.67*   5.77    5.43    4.69*   5.89    5.53    4.86*   5.85    5.47    4.71*   5.80*   5.63    4.75    5.79
508 vowel changed irregular verbs   UN  8.15    8.24    9.04    8.29    8.35    9.11    8.42    8.47    8.93    8.45    8.48    8.98    8.57    8.52    8.91
                                    GA  8.15    7.62*   8.91    8.29    7.69*   8.86+   8.42    7.71*   8.75    8.45    7.88*   8.83    8.57    7.88*   8.63+
                                    V   8.15    7.74*   8.99    8.29    7.80*   8.95    8.42    7.82*   8.76    8.45    8.07*   8.89    8.57    8.00*   8.74
 * Independent t-test treated vs. untreated p<.05
 + Independent t-test treated vs. untreated p<.10
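
As a reading aid for Table 3, the following minimal sketch computes the difference in mean error between treated and untreated networks, both immediately after the intervention and at the end of training. It uses only the UN and GA rows for the 508 shadow training set, transcribed from the table. The names UNTREATED, GA_TREATED and treatment_effects are illustrative and are not part of the original simulations. The sketch compares the reported means only; it does not reproduce the t-tests, which would require the per-network scores.

```python
# Mean errors transcribed from Table 3, "508 shadow training set" rows.
# Keys are intervention onset epochs; each tuple holds the error measured
# (before intervention, 30 epochs later, at the end of training at epoch 1000).
UNTREATED = {
    50: (7.34, 7.39, 8.22),
    100: (7.40, 7.50, 8.26),
    150: (7.56, 7.67, 8.21),
    200: (7.68, 7.70, 8.27),
    250: (7.81, 7.84, 8.33),
}
GA_TREATED = {
    50: (7.34, 6.14, 7.93),
    100: (7.40, 6.25, 8.13),
    150: (7.56, 6.49, 7.91),
    200: (7.68, 6.49, 8.00),
    250: (7.81, 6.49, 8.07),
}


def treatment_effects(untreated, treated):
    """Return {onset: (immediate, final)} differences, treated minus untreated.

    Negative values mean the treated networks had lower (better) error.
    """
    effects = {}
    for onset in sorted(untreated):
        _, un_post, un_end = untreated[onset]
        _, tr_post, tr_end = treated[onset]
        effects[onset] = (round(tr_post - un_post, 2), round(tr_end - un_end, 2))
    return effects


EFFECTS = treatment_effects(UNTREATED, GA_TREATED)

if __name__ == "__main__":
    for onset, (immediate, final) in EFFECTS.items():
        print(f"onset {onset:>3}: immediate {immediate:+.2f}, final {final:+.2f}")
```

For every onset in these rows, the difference at epoch 1000 is smaller in magnitude than the difference measured immediately after the intervention.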
                                   Discussion
In this work, we have sought to build on successful research using ANNs to simulate atypical
cognitive and language development, to consider implications for behavioral interventions to
remediate developmental deficits. We focused on the domain of past tense formation, which has
been a target of intervention for children with grammatical deficits (Ebbels, 2007; Kulkarni
et al., 2014; Seeff-Gabriel et al., 2012). Rather than a realistic model of these
interventions, our goal here was more preliminary: to explore methods for deriving possible
intervention sets, to assess their impact on different areas of performance, to assess the
influence of timing of intervention, and to assess the extent to which any gains were
sustained following the cessation of intervention. However, we followed one of the broad
tenets of an intervention called grammar facilitation, one of the most widely investigated
methods for intervening to address grammar deficits in school age children. In grammar
facilitation, the aim is to make target forms more frequent, which is hypothesized to help the
child identify grammatical rules and give them practice at producing forms they tend to omit
(Ebbels, 2014). In line with this view, our intervention added information to the training set
of an ANN model for a fixed period, to increase the salience of certain regularities in the
problem domain.
  Our results demonstrated that, where a language deficit arises due to limitations in
processing capacity, compensation (optimization on a subset of the problem domain) is more
readily achievable than normalization (improvement on the whole problem domain), and the
particular training items chosen to effect the compensation can alter the size of the effect.
Within the intervention window we considered, we found no reductions in receptiveness of the
ANN to remediation, indicating no entrenchment or reductions in plasticity. However, benefits
did dissipate once the intervention had ceased.
  Returning to the target phenomenon, in reality, behavioral interventions to remediate
developmental disorders of language and cognition are multi-faceted. They are usually
interactional and social, and involve emotional and motivational factors in the child, as well
as cognitive factors. There are myriad causes of variability in children’s abilities, be they
biological, psychological, environmental, or social – factors that must be considered in
planning preventions or interventions (Beauchaine, Neuhaus, Brenner & Gatzke-Kopp, 2008).
Clinical practice is driven by a range of principles including the emerging evidence base and
the therapeutic setting, as well as the child and family’s goals. Within approaches targeting
speech and language needs
directly, the clinician may form a hypothesis as to (i) the nature of the difficulty and (ii)
what will be optimally effective for a child. The results of intervention will further refine
these hypotheses.
  Nevertheless, the quality of neurocomputational mechanisms of learning and development is a
key constraining factor, given that these mechanisms underlie behavior, and given that their
plasticity is crucial in achieving remediation. We believe there is value in computational
modeling work to further understand the mechanistic basis of atypical development and how
deficits might be remediated by behavioral means.

                                 Acknowledgments
This research is supported by the National Natural Science Foundation of China (61402309) and
UK ESRC grant RES-062-23-2721.

                                   References
Abel, S., Willmes, K. & Huber, W. (2007). Model-oriented naming therapy: Testing predictions
  of a connectionist model. Aphasiology, 21(5), 411-447.
Beauchaine, T. P., Neuhaus, E., Brenner, S. L., & Gatzke-Kopp, L. (2008). Ten good reasons to
  consider biological processes in prevention and intervention research. Development and
  Psychopathology, 20, 745-774.
Best, W., Fedor, A., Hughes, L., Kapikian, A., Masterson, J., Roncoli, S., Fern-Pollak, L., &
  Thomas, M. S. C. (2015). Intervening to alleviate word-finding difficulties in children:
  Case series data and a computational modelling foundation. Cognitive Neuropsychology.
  Article first published online: 25 FEB 2015, doi: 10.1080/02643294.2014.1003204
Borovsky, A. & Elman, J. L. (2006). Language input and semantic categories: a relation between
  cognition and early word learning. Journal of Child Language, 33(4), 759-790.
Daniloff, R. G. (2002). Connectionist approaches to clinical problems in speech and language.
  Erlbaum: Mahwah, NJ.
Ebbels, S. (2007). Teaching grammar to school-aged children with specific language impairment
  using Shape Coding. Child Language Teaching & Therapy, 23, 67-93.
Ebbels, S. (2014). Effectiveness of intervention for grammar in school-aged children with
  primary language deficits. Child Language Teaching & Therapy, 30(1), 7-40.
Fedor, A., Best, W., Masterson, J., & Thomas, M. S. C. (2013). Towards identifying principles
  for clinical intervention in developmental language disorders from a neurocomputational
  perspective. DNLTechreport2013-1 (www.psyc.bbk.ac.uk/research/DNL)
Gomez, R. L. (2005). Dynamically guided learning. In M. Johnson & Y. Munakata (Eds.),
  Attention and Performance XXI (pp. 87-110). Oxford: OUP.
Harm, M. W., McCandliss, B. D. & Seidenberg, M. S. (2003). Modeling the successes and failures
  of interventions for disabled readers. Scientific Studies of Reading, 7, 155-182.
Kulkarni, A., Pring, T., & Ebbels, S. (2014). Evaluating the effectiveness of therapy based
  around Shape Coding to develop the use of regular past tense morphemes in two children with
  language impairments. Child Language Teaching & Therapy, 30(3), 245-254.
Law, J., Campbell, C., Roulstone, S., Adams, C. & Boyle, J. (2007). Mapping practice onto
  theory: The speech and language practitioner’s construction of receptive language
  impairment. International Journal of Language and Communication Disorders, 43, 245-63.
Marchman, V. A. (1993). Constraints on plasticity in a connectionist model of the English past
  tense. Journal of Cognitive Neuroscience, 5, 215-234.
Mareschal, D. & Thomas, M. S. C. (2007). Computational modeling in developmental psychology.
  IEEE Transactions on Evolutionary Computation (Special Issue on Autonomous Mental
  Development), 11, 137-150.
Onnis, L., Monaghan, P., Christiansen, M., & Chater, N. (2005). Variability is the spice of
  learning, and a crucial ingredient for detecting and generalizing in nonadjacent
  dependencies. In: K. Forbus, D. Gentner & T. Regier (Eds.), Proceedings of the 26th Annual
  Conference of the Cognitive Science Society (pp. 1047-1052). Mahwah, NJ: Erlbaum.
Plaut, D. C. (1996). Relearning after damage in connectionist networks: Toward a theory of
  rehabilitation. Brain and Language, 52, 25-82.
Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a
  multi-layered perceptron: Implications for child language acquisition. Cognition, 38,
  43-102.
Poll, G. H. (2011). Increasing the odds: Applying emergentist theory in language intervention.
  Language, Speech, and Hearing Services in Schools, 42, 580-591.
Seeff-Gabriel, B., Chiat, S., & Pring, T. (2012). Intervention for co-occurring speech and
  language difficulties. Child Language Teaching & Therapy, 20, 123-35.
Thomas, M. S. C. (2005). Characterising compensation. Cortex, 41(3), 434-442.
Thomas, M. S. C., Forrester, N. A., & Ronald, A. (2015). Multiscale modeling of gene–behavior
  associations in an artificial neural network model of cognitive development. Cognitive
  Science. Article first published online: 3 APR 2015, doi: 10.1111/cogs.12230
Thomas, M. S. C. & Johnson, M. H. (2006). The computational modelling of sensitive periods.
  Developmental Psychobiology, 48(4), 337-344.
Thomas, M. S. C. & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical
  phenotypes. Psychological Review, 110, 647-682.
Thomas, M. S. C. & Knowland, V. C. P. (2014). Modelling mechanisms of persisting and resolving
  delay in language development. Journal of Speech, Language, and Hearing Research, 57(2),
  467-483.