<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Simulating intervention to support compensatory strategies in an artificial neural network model of atypical language development</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Juan Yang (jkxy_yjuan@sicnu.edu.cn)</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael S. C. Thomas</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Sichuan Normal University</institution>
          ,
          <addr-line>Chengdu 610101</addr-line>
          ,
          <country>China</country>
        </aff>
      </contrib-group>
      <fpage>123</fpage>
      <lpage>128</lpage>
      <abstract>
        <p>Artificial neural networks have been used to model developmental deficits in cognitive and language development, most often by including sub-optimal input-output representations or computational parameters in these learning systems. The next step is to simulate intervention to alleviate developmental impairments, to inform the mechanistic basis of remediation. Here we used a sample model of atypical language development (in the well-explored domain of past tense acquisition) to investigate the extent to which alternative training regimes may induce short-term or long-term compensatory changes in underlying function, and the extent to which this depends on the timing of intervention. We present a new method to derive 'intervention' training sets as a simulation of behavioral interventions, and assess its adequacy in our sample model.</p>
      </abstract>
      <kwd-group>
        <kwd>language development</kwd>
        <kwd>developmental disorders</kwd>
        <kwd>artificial neural network models</kwd>
        <kwd>intervention</kwd>
        <kwd>compensation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Computational models of development, particularly those
employing artificial neural networks (ANN), have provided
hypotheses about the mechanistic bases of cognitive and
language deficits
        <xref ref-type="bibr" rid="ref14">(Mareschal &amp; Thomas, 2007)</xref>
        . For
example, in the domain of language, Harm, McCandliss and
Seidenberg (2003) demonstrated how limited connectivity
in the phonology component of a reading model produced a
system with symptoms of dyslexia. In a model of
inflectional morphology,
        <xref ref-type="bibr" rid="ref20">Thomas (2005)</xref>
        demonstrated how
shallow sigmoid activation functions yielded processing
units that were insensitive to small changes in the input,
producing networks that exhibited developmental delay.
More recently,
        <xref ref-type="bibr" rid="ref24">Thomas and Knowland (2014)</xref>
        considered
how multiple changes to intrinsic computational properties
and extrinsic environmental factors could produce different
types of language delay that either persisted or resolved over
developmental time.
      </p>
      <p>
        Progress of this type motivated
        <xref ref-type="bibr" rid="ref5">Daniloff (2002</xref>
        , p.viii), in
his book Connectionist approaches to clinical problems in
speech and language, to comment ‘ANN theory will …
form the backbone of much of language therapy in the near
future’. However, research and practice have yet to repay
this optimism
        <xref ref-type="bibr" rid="ref18">(though see Poll, 2011, for renewed attempts
to make these links)</xref>
        . Only one computational study has
systematically explored the efficacy of a single intervention
to address a developmental deficit
        <xref ref-type="bibr" rid="ref10 ref10 ref23">(in Harm et al.’s 2003
reading model; Harm, McCandliss &amp; Seidenberg, 2003)</xref>
        .
Slightly more work has used ANN models to investigate
remediation following acquired damage. For example, in a
model of acquired dyslexia,
        <xref ref-type="bibr" rid="ref16">Plaut (1996)</xref>
        considered the
degree and speed of recovery through retraining, the extent
to which improvement on treated items generalizes to
untreated items, and how treated items are selected to
maximize this generalization.
        <xref ref-type="bibr" rid="ref1">Abel et al. (2007)</xref>
        demonstrated how an adult model of aphasia could guide
actual interventions depending on patients’ error patterns.
Even here, however, the work remains limited.
      </p>
      <p>
        The computational approach to development has
generated a growing understanding of environmental factors
that influence learning in typical development
        <xref ref-type="bibr" rid="ref15 ref4 ref9">(Borovsky &amp;
Elman, 2006; Gomez, 2005; Onnis et al., 2005)</xref>
        . This
includes the importance of factors such as the frequency of
training items, their similarity and variability, and the
provision of novelty in familiar contexts. However, there
has yet to be a consideration of how these factors interact
with learning systems containing the sorts of atypical
computational constraints that lead to impoverished internal
representations and, in turn, behavioral deficits compared to
typically developing children. It is yet a further step to link
such an understanding with the diverse activities that tend to
be used by clinicians in speech and language therapy,
including such activities as modeling, forced alternatives,
repetition, visual approaches to support oral language, and
reducing distractions
        <xref ref-type="bibr" rid="ref12 ref7">(Ebbels, 2014; Law et al., 2007)</xref>
        .
      </p>
      <p>From the perspective of individual network models
simulating development, where development is conceived of
as the acquisition of the domain instantiated in the learning
environment, it is not obvious that ‘behavioral interventions’
could alleviate a developmental impairment that arises
either through inadequate representations or insufficient
computational power. Here, we conceive of a behavioral
intervention as representing the addition of some further
training items to the normal training set of the network
model. If the model is unable to learn the training set to a
given performance level through limitations in processing
capacity, adding further input-output mappings to the
training set is unlikely to enhance performance on the
patterns comprising the original training set. What one
might call normalization through behavioral intervention is
therefore difficult if one conceives of developmental deficits
as arising from limitations in individual systems. We define
normalization as the acquisition of the abilities and
knowledge that any typically developing system acquires
through exposure to the normal training set.</p>
      <p>There are at least three possible responses to this
difficulty in achieving normalization. First, the
computational properties of the system might be enhanced
to enable it to acquire the training set (for instance, for the
actual child, by interventions targeting motivation, or by
pharmacological means; for the network, by altering one or
more of its parameters).</p>
      <p>
        Second, the intervention might target the input and output
representations of the system, thereby simplifying the
computational problem that the network has to learn. Harm,
McCandliss and Seidenberg’s (2003) simulated
phonological intervention for developmental dyslexia
used this method.
        <xref ref-type="bibr" rid="ref3">Best et al. (2015)</xref>
        have recently used a
similar approach to simulate behavioral interventions for
developmental deficits in productive vocabulary,
alternatively targeting phonological or semantic
representations that represent the two codes that must be
associated in vocabulary development.
      </p>
      <p>Third, one might take the view that what the atypical
system needs to learn is not the training set per se (even
though this is what typical systems acquire), but a general
function implicit in the items comprising the training set.
Acquisition of this general function can be assessed by
performance on generalization sets rather than the training
set. There may then be input-output mappings that can be
added to the training set which could improve the network’s
ability to learn the general function, even if performance on
the original training set did not improve (or even worsened).
One might term this approach compensation, since the aim
is to optimize a subset of behaviors present in the original
training set.</p>
      <p>
        In this paper, we investigate possible ‘behavioral
interventions’ to encourage compensation (so defined) in a
widely used model drawn from the domain of language
development, that of English past tense formation. This
model has been used to capture developmental trajectories
and error patterns as children acquire English verb
morphology, but it has also been used as a sample
associative system to consider more general issues in
development
        <xref ref-type="bibr" rid="ref13 ref21 ref22">(see, e.g., use of this model to investigate
sensitive periods in development: Marchman, 1993; Thomas &amp;
Johnson, 2006; and to investigate population-level individual
differences: Thomas, Forrester &amp; Ronald, 2015)</xref>
        . We
introduce a method to derive ‘intervention patterns’ that are
added to the training set of atypical networks for a limited
period in development, simulating an intervention given to a
child diagnosed with a developmental impairment. We
compare the effectiveness of two different intervention sets
for improving generalization performance on several
possible implicit functions that the network might be able to
acquire.
      </p>
      <p>
        We employed a simple simulation of past tense formation
        <xref ref-type="bibr" rid="ref13">(Plunkett &amp; Marchman, 1993)</xref>
        as our base model. This
model uses an ANN to learn a quasi-regular mapping
problem instantiated in an artificial language designed to have
many of the properties of English past tense formation. The
learning domain is predominantly characterized by a general
rule (add –ed to a verb stem to form its past tense). However,
there is a minority of exception (irregular) verbs that
form their past tense in different ways, for instance with
arbitrary associations between stem and past tense form
(go-went), no-change irregulars where the past tense form is the
same as the verb stem (hit-hit), and vowel-change irregulars
(sing-sang, ring-rang). The network is required to learn the
association of verb stems to their past tense forms.
Generalization can be tested on whether the network can
apply the past tense rule to novel verbs, or can apply any of
the irregular patterns to novel verbs sharing similarity to
irregulars existing in the training set.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Simulating atypical development in the base model</title>
      <p>
        Three-layer ANNs were used to simulate individual
children. All the ANNs had 57 input units in the input layer
and 62 output units in the output layer, used to represent
triphoneme verb stems and their past tense forms. The 57
input units corresponded to the binary encoding of a
monosyllabic three-phoneme verb, where each phoneme
was represented using 19 binary articulatory features. The
output included the same binary encoding of the
triphoneme string with the addition of 5 extra bits to represent
the suffix part of the past tense. The encoding is based on
the one proposed by
        <xref ref-type="bibr" rid="ref17">Plunkett &amp; Marchman (1991</xref>
        ; P&amp;M91)
        <xref ref-type="bibr" rid="ref23">(see Thomas &amp; Karmiloff-Smith, 2003, for more details of
the artificial language, including the consonant-vowel
templates used to generate the artificial verbs)</xref>
        .
      </p>
      <p>Figure 1: An ‘atypical’ ANN, in which a capacity deficit is implemented by
reducing the number of hidden units from 50 to 15.</p>
      <p>
        The training set comprised 508 artificial verbs: 410
regular verbs, 20 no-change (identical) irregular verbs, 10 arbitrary
irregular verbs, and 68 vowel-change irregular verbs.
Developmental trajectories were simulated by 1000
presentations of this training set (epochs). A network with
50 hidden units (backpropagation algorithm, learning rate
0.1, momentum 0, temperature 1, initial weights randomized
between ±1) proved able to learn the training set within
approximately 300 epochs. We implemented a
developmental deficit by reducing the computational
capacity of the network
        <xref ref-type="bibr" rid="ref24">(Thomas &amp; Knowland, 2014)</xref>
        .
Piloting indicated that a reduction of hidden unit resources
to 15 produced a persisting deficit in learning the training
set (architecture shown in Figure 1).
      </p>
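      <p>As a rough illustration of the capacity manipulation, the following minimal sketch builds a typical (50 hidden units) and an atypical (15 hidden units) network with the layer sizes given above and counts their weights. It is not the simulation code used in this study: the function names, the use of one bias weight per unit, and the all-zero placeholder input are assumptions made for illustration only, and no training is performed.</p>
      <preformat preformat-type="code">
```python
import math
import random

N_INPUT = 3 * 19        # three phonemes, 19 binary articulatory features each
N_OUTPUT = N_INPUT + 5  # same stem encoding plus 5 suffix bits


def sigmoid(x, temperature=1.0):
    """Logistic activation; the temperature scales the slope."""
    return 1.0 / (1.0 + math.exp(-x / temperature))


def init_layer(n_in, n_out, rng):
    """Weights drawn uniformly from [-1, 1]; one bias weight per unit (assumed)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layer, activations):
    """Propagate activations through one fully connected sigmoid layer."""
    extended = list(activations) + [1.0]  # constant input feeding the bias weight
    return [sigmoid(sum(w * a for w, a in zip(unit, extended))) for unit in layer]


def build_network(n_hidden, seed=0):
    """Input-to-hidden and hidden-to-output weight layers."""
    rng = random.Random(seed)
    return [init_layer(N_INPUT, n_hidden, rng), init_layer(n_hidden, N_OUTPUT, rng)]


def n_weights(network):
    """Total number of adjustable weights, including the assumed biases."""
    return sum(len(unit) for layer in network for unit in layer)


typical = build_network(n_hidden=50)
atypical = build_network(n_hidden=15)
stem = [0] * N_INPUT  # placeholder pattern, not a real verb encoding
output = forward(atypical[1], forward(atypical[0], stem))
print(len(output), n_weights(typical), n_weights(atypical))
```
      </preformat>
      <p>Under these assumptions the atypical network has fewer than a third of the adjustable weights of the typical network (1862 versus 6062), which gives a concrete sense of the reduction in computational capacity.</p>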
    </sec>
    <sec id="sec-3">
      <title>Simulating interventions to encourage compensation in the atypical network</title>
      <p>We simulated a behavioral intervention to remediate the
developmental impairment in the following way. We
assumed that the impairment was diagnosed at some point
relatively early in development. At this time, additional
patterns were added to the original training set. The
intervention set supplemented, rather than replaced, the
original training set, since we assume that in a clinical
setting, interventions take place against the child’s
continued experience of his or her normal learning
environment. In ANNs, replacement would also incur the
risk of catastrophic interference. We assumed that the
behavioral intervention was much smaller in scale than
continued everyday experience, and so limited the
intervention set to approximately 10% of the size of the original training set
(50 input-output mappings versus 508 in the original set).
Intervention continued for a limited duration (30 epochs of
training), at which point the intervention ceased and training
reverted to the original set. Intervention had the goal of
encouraging acquisition of the regular past tense rule.</p>
      <p>
        We manipulated the timing of intervention, from ‘early’ at
50 epochs in steps of 50 up to ‘late’ at 250 epochs (i.e., 5
stages: 50, 100, 150, 200, and 250) compared to the full
training trajectory of 1000 epochs. The importance of early
intervention has been stressed within a clinical setting,
under the view that plasticity reduces over time. Simple
feedforward ANNs have been claimed to capture a
reduction in plasticity through entrenchment effects
        <xref ref-type="bibr" rid="ref13">(Marchman, 1993)</xref>
        , though a broader set of mechanisms
may also produce reductions in plasticity, such as synaptic
pruning
        <xref ref-type="bibr" rid="ref22">(Thomas &amp; Johnson, 2006)</xref>
        .
      </p>
      <p>
        We assessed normalization with respect to changes in
performance on the original training set. We assessed
compensation with respect to changes in performance on
five generalization sets. These were:
• A novel rhyme set. Each novel verb shared two out of
three phonemes with a verb in the training set. There
were 410 regular verb rhymes, 20 no-change irregular
verb rhymes, 10 arbitrary irregular verb rhymes, and 76
vowel-change irregular verb rhymes. Finally, there were
56 novel verbs that shared only one phoneme with any verb
in the training set. This novel verb set has been used in
previous simulations
        <xref ref-type="bibr" rid="ref23">(e.g., Thomas &amp; Karmiloff-Smith,
2003)</xref>
        .
• A shadow training set. These were novel artificial verbs
regenerated using the same consonant-vowel templates
as the original training set (P&amp;M91) and in the same
proportions: 410 regular verbs, 20 no-change irregular
verbs, 10 arbitrary irregular verbs, and 68 vowel-change
irregular verbs.
• A novel set of 508 arbitrary irregular verbs generated
using the P&amp;M91 templates.
• A novel set of 508 no-change irregular verbs generated
using the P&amp;M91 templates.
• A novel set of 508 vowel-change irregular verbs
generated using the P&amp;M91 templates.
      </p>
      <p>For the novel rhyme set, generalization was assessed
according to accuracy of producing regularly inflected
forms. For irregular verbs in the shadow training set, and
for the three novel irregular sets, generalization was tested
according to accuracy of producing the target irregular
output form.</p>
      <p>We asked two questions. Did the intervention to
encourage a compensatory strategy produce any benefit at
the immediate end of the intervention period? And if so, did
any benefit persist after the intervention ceased so that it
was observable at the end of training? For the earliest
intervention, performance was therefore assessed at 80 and
1000 epochs. For the latest intervention, performance was
assessed at 280 and 1000 epochs. In each case, there were
10 replications of networks with different random seeds.</p>
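      <p>The intervention schedule described in this section can be summarized in a short sketch. The constants are taken from the text; the function names and the convention that epochs are counted from 1, with the intervention occupying the 30 epochs immediately after its onset, are assumptions made for illustration.</p>
      <preformat preformat-type="code">
```python
ORIGINAL_SIZE = 508       # patterns in the original training set
INTERVENTION_SIZE = 50    # patterns added during intervention
INTERVENTION_EPOCHS = 30  # duration of the intervention
TOTAL_EPOCHS = 1000       # full training trajectory
ONSETS = (50, 100, 150, 200, 250)


def in_intervention(epoch, onset):
    """True for the 30 epochs that follow the onset (epochs are 1-based here)."""
    return epoch in range(onset + 1, onset + INTERVENTION_EPOCHS + 1)


def training_set_size(epoch, onset):
    """Patterns presented in one epoch: the original set, plus the
    intervention set while the intervention is running."""
    extra = INTERVENTION_SIZE if in_intervention(epoch, onset) else 0
    return ORIGINAL_SIZE + extra


def assessment_epochs(onset):
    """Immediate end of intervention, and end of training."""
    return (onset + INTERVENTION_EPOCHS, TOTAL_EPOCHS)


for onset in ONSETS:
    print(onset, assessment_epochs(onset), training_set_size(onset + 1, onset))
```
      </preformat>
      <p>For the earliest onset this yields assessment points at 80 and 1000 epochs, and for the latest at 280 and 1000 epochs, matching the values given above.</p>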
      <p>This leaves the challenge of how to construct an
intervention set to encourage a compensatory strategy. In
the next section, we propose a method.</p>
    </sec>
    <sec id="sec-4">
      <title>A method for generating intervention sets for compensation</title>
      <p>The problem we needed to solve was how to choose the most
effective 50 intervention verbs among the hundreds of
thousands of artificial verbs possible within the
encoding scheme. The idea is intuitive: if we suppose that some
of the input units are more important and decisive than
others, then the intervention verbs can be chosen based on
these features. The problem then becomes how to identify
the key features of the original training set within the
57-dimensional input space. The resulting intervention set
should help remedy a disordered ANN in its
generalization of the past tense rule. Broadly, the approach
we adopted was to take an ANN that had successfully
learned the past tense problem. We then varied the
activation level of input units and assessed the extent to
which this generated error on the output. This
should indicate the extent to which the input units encoding key
dimensions influence learning performance.
Formally, we translated this challenge into an optimization
problem, specified as:</p>
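      <p>
        <disp-formula id="eq-1"><tex-math><![CDATA[E(s) = \bigl\lVert Y - F_{L}(X_{s}) \bigr\rVert, \qquad \sum_{i} s_{i} = k, \quad s_{i} \in \{0, 1\} \qquad (1)]]></tex-math></disp-formula>
In Formula (1), Y is the past tense matrix in the training data
set, while <inline-formula><tex-math>F_{L}(X_{s})</tex-math></inline-formula> is the actual output of the ANN for the training input <inline-formula><tex-math>X_{s}</tex-math></inline-formula> under the selection s of input units, and L is
the number of the final layer of the ANN. So, Formula (1)
attempts to select the input units that contribute most to
the learning based on the training data set. <inline-formula><tex-math>F_{l}</tex-math></inline-formula> is a
recursive function defined in Formula (2), where <inline-formula><tex-math>W_{l}</tex-math></inline-formula> denotes the weights of layer l:
        <disp-formula id="eq-2"><tex-math><![CDATA[F_{l}(X_{s}) = \begin{cases} f(W_{l}, X_{s}), & l = 1 \\ f(W_{l}, F_{l-1}(X_{s})), & l > 1 \end{cases} \qquad (2)]]></tex-math></disp-formula>
      </p>
      <p>Since this is a combinatorial optimization problem, we used
a Genetic Algorithm (GA) approach to find the optimized
result. In this algorithm, k = 25 features. However, after the
GA had selected these features, a further selection was
necessary, since no artificial verbs could fully satisfy the
selected features. In the final step, a subset of 5 or 6 features
was chosen to generate two possible intervention data sets.</p>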
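      <p>To make the search procedure concrete, the following is a minimal sketch of a genetic algorithm that selects 25 of the 57 input units. It is not the GA used in this study, whose operators and settings are not specified here: the elitist survival scheme, the crossover and mutation operators, the population size, and the number of generations are all assumptions. The fitness function is a purely hypothetical stand-in (random per-unit importance scores); in the study, fitness would be derived from the output error of a trained ANN.</p>
      <preformat preformat-type="code">
```python
import random

N_UNITS = 57  # input units in the encoding
K = 25        # number of units each candidate selection must contain


def random_selection(rng):
    """A candidate solution: a sorted tuple of K distinct unit indices."""
    return tuple(sorted(rng.sample(range(N_UNITS), K)))


def mutate(selection, rng):
    """Swap one selected unit for an unselected one, keeping exactly K units."""
    chosen = set(selection)
    removed = rng.choice(sorted(chosen))
    added = rng.choice(sorted(set(range(N_UNITS)) - chosen))
    chosen.remove(removed)
    chosen.add(added)
    return tuple(sorted(chosen))


def crossover(parent_a, parent_b, rng):
    """Draw K units from the union of both parents' selections."""
    pool = sorted(set(parent_a) | set(parent_b))
    return tuple(sorted(rng.sample(pool, K)))


def run_ga(fitness, generations=60, population_size=40, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [random_selection(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


# Hypothetical stand-in for the network-based objective (not from the study).
toy_rng = random.Random(1)
toy_importance = [toy_rng.random() for _ in range(N_UNITS)]


def toy_fitness(selection):
    """Sum of the made-up importance scores of the selected units."""
    return sum(toy_importance[i] for i in selection)


best = run_ga(toy_fitness)
print(len(best), round(toy_fitness(best), 3))
```
      </preformat>
      <p>Representing each candidate as a set of exactly 25 indices, rather than as a free 57-bit string, keeps the size constraint satisfied by construction, so no penalty term is needed in the fitness function.</p>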
    </sec>
    <sec id="sec-5">
      <title>Key features selected to generate the intervention data sets</title>
      <p>After running the GA, two sets of features were constructed.
We refer to the first as the GA feature set, since it was
closest to a shortened version of the original filtered 25
features yet consistent with legal verbs within the P&amp;M91
encoding scheme. Novel verbs each contained 5 selected
features shown in Table 1. We refer to the second as the
Voice satisfied feature set. Novel verbs each contained 6
selected features shown in Table 2. These verbs were more
consistent with those present in the original training set in
terms of their voicing features. One might think of the first
intervention set as optimized but somewhat strange, and the
second as slightly less optimized but more natural given the
ANN’s previous training experience.</p>
      <p>Fifty novel verbs were created for each intervention set.
The target output for each novel verb was the regular
inflected past tense.</p>
      <p>Table 1: GA feature intervention data set.</p>
      <p>Corresponding unit: 1, 2, 5, 21, 43</p>
      <p>Meaning: consonantal, voiced, consonantal, voiceless, vowel, voiced</p>
    </sec>
    <sec id="sec-6">
      <title>Results</title>
      <p>The results of the intervention are shown in Table 3.</p>
      <p>Beginning with the early intervention condition, no reliable
improvement was observed on the original training set at the
end of intervention. If anything, intervention caused
performance on the training set to worsen. This is in line
with the view that normalization is difficult for a network
with limited capacity. Compensation was assessed via 5
novel verb sets assessing generalization of different
functions that might be extended from the original training
set. The first two, the novel rhyme and shadow training sets,
contain substantial numbers of regular verbs, which one
might expect to be aided by the intervention. In both cases,
statistically reliable benefits of intervention were observed.</p>
      <p>Three sets considered the possibility of generalizing
irregular patterns. Since there is no systematic relation
between arbitrary mappings, one would not expect an
intervention effect on novel arbitrary verbs, and none was
found. However, both novel no-change and novel
vowel-change generalization sets showed benefits. This implies
that the intervention had better enabled the atypical network
to separate regular and irregular mappings within its
representational space, and so generalize both types of
general function to novel verbs with features that would
support these functions.</p>
      <p>We ran a series of omnibus ANOVAs to assess broader
patterns. To emphasize the possibility that timing of
intervention might have an effect, we focused on a
comparison between the earliest intervention point (50
epochs) and latest (250 epochs). Figure 2 demonstrates the
effect of intervention at intermediate time points. We first
examined training set performance, considering factors of
group (treated vs. untreated), intervention type (GA vs. V),
and timing (50 vs. 250 epochs), separately for the immediate
end of intervention and at the end of training. For
performance at the immediate end of intervention, there was
a reliable effect of the intervention (F(1,9)=12.96, p=.006),
with an effect size of ηp2=.59, corresponding to a worsening
of performance. The intervention effect was not modulated
by intervention type, nor by timing of intervention. For
performance at the end of training, there was no effect of the
intervention at 1000 epochs.
ηp2=.78); improvement depended on the generalization set
used (F(1,9)=7.64, p=.022, ηp2=.46); and the intervention
effect was not modulated by timing of intervention. The
results at the end of training were similar, but with a
reduced intervention effect size (ηp2=.83), and now no
modulation depending on the type of intervention used.</p>
      <p>In sum, in line with our expectations, compensatory
strategies were effectively encouraged via the addition of an
intervention set. Intervention sets did not achieve
normalization, and indeed the compensatory strategy (while
effective) initially caused performance to further diverge
from the typical trajectory. Benefits of intervention were
possible across a wide stretch of the developmental
trajectory, with little indication of reductions in plasticity
across the range of timing of interventions we considered.
However, early interventions showed dissipating effects
across development once the intervention was discontinued,
with the exact type of intervention becoming less relevant.</p>
      <p>Table 3: Intervention results. UN=untreated, GA=GA feature intervention set, V=voice satisfied intervention set. Scores
show performance of the network prior to intervention, immediately following an intervention lasting 30 epochs, and at the
end of training of 1000 epochs, for untreated networks and networks treated with each intervention set. Performance is
measured by sum-squared error, where lower numbers represent better performance and higher numbers represent worse
performance. Reliable treatment effects are marked. Interventions were at five time points: 50, 100, 150, 200, and 250 epochs.</p>
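      <p>The reported effect sizes can be cross-checked against the F ratios using the standard identity between partial eta squared and F, namely (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the two F(1,9) values quoted in the text; the function name is our own.</p>
      <preformat preformat-type="code">
```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F ratios reported in the text, both with 1 and 9 degrees of freedom.
training_set_effect = partial_eta_squared(12.96, 1, 9)
generalization_set_effect = partial_eta_squared(7.64, 1, 9)
print(round(training_set_effect, 2), round(generalization_set_effect, 2))
```
      </preformat>
      <p>This prints 0.59 and 0.46, in agreement with the values of ηp2 reported for those two tests.</p>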
    </sec>
    <sec id="sec-7">
      <title>Discussion</title>
      <p>
        In this work, we have sought to build on successful research
using ANNs to simulate atypical cognitive and language
development, to consider implications for behavioral
interventions to remediate developmental deficits. We
focused on the domain of past tense formation, which has
been a target of intervention for children with grammatical
deficits
        <xref ref-type="bibr" rid="ref11 ref19 ref6">(Ebbels, 2007; Kulkarni et al., 2014; Seeff-Gabriel
et al., 2012)</xref>
        . Rather than a realistic model of these
interventions, our goal here was more preliminary: to
explore methods for deriving possible intervention sets, to
assess their impact on different areas of performance, to
assess the influence of timing of intervention, and to assess
the extent to which any gains were sustained following the
cessation of intervention. However, we followed one of the
broad tenets of an intervention called grammar facilitation,
one of the most widely investigated methods for intervening
to address grammar deficits in school-age children. In
grammar facilitation, the aim is to make target forms more
frequent, which is hypothesized to help the child identify
grammatical rules and give them practice at producing
forms they tend to omit
        <xref ref-type="bibr" rid="ref7">(Ebbels, 2014)</xref>
        . In line with this
view, our intervention added information to the training set
of an ANN model for a fixed period, to increase the salience
of certain regularities in the problem domain.
      </p>
      <p>Our results demonstrated that, where a language deficit
arises due to limitations in processing capacity,
compensation (optimization on a subset of the problem
domain) is more readily achievable than normalization
(improvement on the whole problem domain), and the
particular training items chosen to effect the compensation
can alter the size of the effect. Within the intervention
window we considered, we found no reductions in
receptiveness of the ANN to remediation, indicating no
entrenchment or reductions in plasticity. However, benefits
did dissipate once the intervention had ceased.</p>
      <p>
        Returning to the target phenomenon, in reality, behavioral
interventions to remediate developmental disorders of
language and cognition are multi-faceted. They are usually
interactional and social, and involve emotional and
motivational factors in the child, as well as cognitive factors.
There are myriad causes of variability in children’s abilities,
be they biological, psychological, environmental, or social –
factors that must be considered in planning preventions or
interventions
        <xref ref-type="bibr" rid="ref2">(Beauchaine, Neuhaus, Brenner &amp;
Gatzke-Kopp, 2008)</xref>
        . Clinical practice is driven by a range of
principles including the emerging evidence base and the
therapeutic setting, as well as the child and family’s goals.
Within approaches targeting speech and language needs
directly, the clinician may form a hypothesis as to (i) the
nature of the difficulty and (ii) what will be optimally
effective for a child. The results of intervention will further
refine these hypotheses.
      </p>
      <p>Nevertheless, the quality of neurocomputational
mechanisms of learning and development is a key
constraining factor, given that these mechanisms underlie
behavior, and given that their plasticity is crucial in
achieving remediation. We believe there is value in
computational modeling work to further understand the
mechanistic basis of atypical development and how deficits
might be remediated by behavioral means.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This research is supported by the National Natural Science
Foundation of China (61402309) and UK ESRC grant
RES-062-23-2721.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Abel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willmes</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Huber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Model-oriented naming therapy: Testing predictions of a connectionist model</article-title>
          .
          <source>Aphasiology</source>
          ,
          <volume>21</volume>
          (
          <issue>5</issue>
          ),
          <fpage>411</fpage>
          -
          <lpage>447</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Beauchaine</surname>
            ,
            <given-names>T. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brenner</surname>
            ,
            <given-names>S. L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gatzke-Kopp</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Ten good reasons to consider biological processes in prevention and intervention research</article-title>
          .
          <source>Development and Psychopathology</source>
          ,
          <volume>20</volume>
          ,
          <fpage>745</fpage>
          -
          <lpage>774</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Best</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fedor</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hughes</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kapikian</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Masterson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roncoli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fern-Pollak</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Intervening to alleviate word-finding difficulties in children: Case series data and a computational modelling foundation</article-title>
          .
          <source>Cognitive Neuropsychology</source>
          . Article first published online: 25 FEB 2015, doi: 10.1080/02643294.2014.1003204
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Borovsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Elman</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Language input and semantic categories: a relation between cognition and early word learning</article-title>
          .
          <source>Journal of Child Language</source>
          ,
          <volume>33</volume>
          (
          <issue>4</issue>
          ),
          <fpage>759</fpage>
          -
          <lpage>790</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Daniloff</surname>
            ,
            <given-names>R. G.</given-names>
          </string-name>
          (
          <year>2002</year>
          ).
          <article-title>Connectionist approaches to clinical problems in speech and language</article-title>
          . Mahwah, NJ: Erlbaum.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Ebbels</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Teaching grammar to school-aged children with specific language impairment using Shape Coding</article-title>
          .
          <source>Child Language Teaching &amp; Therapy</source>
          ,
          <volume>23</volume>
          ,
          <fpage>67</fpage>
          -
          <lpage>93</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Ebbels</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Effectiveness of intervention for grammar in school-aged children with primary language deficits</article-title>
          .
          <source>Child Language Teaching &amp; Therapy</source>
          ,
          <volume>30</volume>
          (
          <issue>1</issue>
          ),
          <fpage>7</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Fedor</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Best</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Masterson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Towards identifying principles for clinical intervention in developmental language disorders from a neurocomputational perspective</article-title>
          . DNL Tech Report 2013-1 (www.psyc.bbk.ac.uk/research/DNL)
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Gomez</surname>
            ,
            <given-names>R. L.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Dynamically guided learning</article-title>
          . In M. Johnson &amp; Y. Munakata (Eds.),
          <source>Attention and Performance XXI</source>
          (pp.
          <fpage>87</fpage>
          -
          <lpage>110</lpage>
          ). Oxford: OUP.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Harm</surname>
            ,
            <given-names>M. W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCandliss</surname>
            ,
            <given-names>B. D.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Seidenberg</surname>
            ,
            <given-names>M. S.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Modeling the successes and failures of interventions for disabled readers</article-title>
          .
          <source>Scientific Studies of Reading</source>
          ,
          <volume>7</volume>
          ,
          <fpage>155</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Kulkarni</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pring</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ebbels</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Evaluating the effectiveness of therapy based around Shape Coding to develop the use of regular past tense morphemes in two children with language impairments</article-title>
          .
          <source>Child Language Teaching &amp; Therapy</source>
          ,
          <volume>30</volume>
          (
          <issue>3</issue>
          ),
          <fpage>245</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Law</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campbell</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roulstone</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Boyle</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Mapping practice onto theory: The speech and language practitioner's construction of receptive language impairment</article-title>
          .
          <source>International Journal of Language and Communication Disorders</source>
          ,
          <volume>43</volume>
          ,
          <fpage>245</fpage>
          -
          <lpage>263</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Marchman</surname>
            ,
            <given-names>V. A.</given-names>
          </string-name>
          (
          <year>1993</year>
          ).
          <article-title>Constraints on plasticity in a connectionist model of the English past tense</article-title>
          .
          <source>Journal of Cognitive Neuroscience</source>
          ,
          <volume>5</volume>
          ,
          <fpage>215</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Mareschal</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Computational modeling in developmental psychology</article-title>
          .
          <source>IEEE Transactions on Evolutionary Computation (Special Issue on Autonomous Mental Development)</source>
          ,
          <volume>11</volume>
          ,
          <fpage>137</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Onnis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monaghan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Christiansen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chater</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Variability is the spice of learning, and a crucial ingredient for detecting and generalizing in nonadjacent dependencies</article-title>
          . In K. Forbus, D. Gentner, &amp; T. Regier (Eds.),
          <source>Proceedings of the 26th Annual Conference of the Cognitive Science Society</source>
          (pp.
          <fpage>1047</fpage>
          -
          <lpage>1052</lpage>
          ). Mahwah, NJ: Erlbaum.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Plaut</surname>
            ,
            <given-names>D. C.</given-names>
          </string-name>
          (
          <year>1996</year>
          ).
          <article-title>Relearning after damage in connectionist networks: Toward a theory of rehabilitation</article-title>
          .
          <source>Brain and Language</source>
          ,
          <volume>52</volume>
          ,
          <fpage>25</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Plunkett</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Marchman</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>1991</year>
          ).
          <article-title>U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition</article-title>
          .
          <source>Cognition</source>
          ,
          <volume>38</volume>
          ,
          <fpage>43</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Poll</surname>
            ,
            <given-names>G. H.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Increasing the odds: Applying emergentist theory in language intervention</article-title>
          .
          <source>Language, Speech, and Hearing Services in Schools</source>
          ,
          <volume>42</volume>
          ,
          <fpage>580</fpage>
          -
          <lpage>591</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Seeff-Gabriel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiat</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pring</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Intervention for co-occurring speech and language difficulties</article-title>
          .
          <source>Child Language Teaching &amp; Therapy</source>
          ,
          <volume>20</volume>
          ,
          <fpage>123</fpage>
          -
          <lpage>135</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Characterising compensation</article-title>
          .
          <source>Cortex</source>
          ,
          <volume>41</volume>
          (
          <issue>3</issue>
          ),
          <fpage>434</fpage>
          -
          <lpage>442</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Forrester</surname>
            ,
            <given-names>N. A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ronald</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Multiscale modeling of gene-behavior associations in an artificial neural network model of cognitive development</article-title>
          .
          <source>Cognitive Science</source>
          . Article first published online: 3 APR 2015, doi: 10.1111/cogs.12230
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>M. H.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>The computational modelling of sensitive periods</article-title>
          .
          <source>Developmental Psychobiology</source>
          ,
          <volume>48</volume>
          (
          <issue>4</issue>
          ),
          <fpage>337</fpage>
          -
          <lpage>344</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Karmiloff-Smith</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Modeling language acquisition in atypical phenotypes</article-title>
          .
          <source>Psychological Review</source>
          ,
          <volume>110</volume>
          ,
          <fpage>647</fpage>
          -
          <lpage>682</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>M. S. C.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Knowland</surname>
            ,
            <given-names>V. C. P.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Modelling mechanisms of persisting and resolving delay in language development</article-title>
          .
          <source>Journal of Speech, Language, and Hearing Research</source>
          ,
          <volume>57</volume>
          (
          <issue>2</issue>
          ),
          <fpage>467</fpage>
          -
          <lpage>483</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>