       Hebbian Learning Mechanisms Help Explain the Maturation of Multisensory Speech
  Integration in Children with Autism Spectrum Disorder (ASD) and with Typical Development
                              (TD): a Neurocomputational Analysis.

                                       Cristiano Cuppini (cristiano.cuppini@unibo.it)

                                            Mauro Ursino (mauro.ursino@unibo.it)

                                            Elisa Magosso (elisa.magosso@unibo.it)
        Department of Electrical, Electronic and Information Engineering, University of Bologna, Viale Risorgimento 2,
                                                    Bologna, 40136, Italy

                                            Lars A. Ross (lars.ross@einstein.yu.edu)

                                           John J. Foxe (john.foxe@einstein.yu.edu)

                                    Sophie Molholm (sophie.molholm@einstein.yu.edu)
         Department of Pediatrics and Neuroscience, Albert Einstein College of Medicine, 1225 Morris Park Avenue
                                                 Bronx, NY 10461, USA


Abstract

Cognitive tasks such as communication and speech comprehension rely on the brain’s ability to exploit and integrate sensory information of different modalities. Accordingly, the appropriate development of multisensory speech integration (MSI) greatly influences a child’s ability to successfully relate with others. Several experimental findings have shown that speech intelligibility is improved by visualizing a speaker’s articulations, and that MSI continues developing late into childhood. This work aims at developing a network to analyze the role of sensory experience during the early stages of life as a mechanism responsible for the maturation of these integrative abilities in teenagers. We extended a model previously developed to study multisensory integration in cortical regions (Magosso et al., 2012; Cuppini et al., 2014) by incorporating a multisensory area known to be involved in audiovisual speech processing, the superior temporal sulcus (STS). The model suggests that the maturation of MSI is primarily due to the maturation of direct connections among primary unisensory regions. This process is the result of a training phase during which the network was exposed to sensory-specific and cross-sensory stimuli, and excitatory projections among the unisensory regions of the model were subjected to Hebbian rules of potentiation and depression. With such a model, we also analyzed the acquisition of adult MSI abilities in ASD children, and we were able to explain the delayed maturation as the result of a lower level of multisensory exposure during early phases of life.

Keywords: Autism Spectrum Disorder (ASD); Neural Networks; Hebbian Learning Rules; Multisensory Speech Integration; McGurk Effect

Introduction

The brain’s ability to exploit and integrate sensory information of different modalities is fundamental not just for simple detection tasks, but also for more demanding perceptual-cognitive functions, such as those involved in communication. For example, the intelligibility of speech is significantly improved when one can see the speaker’s articulations. Accordingly, the appropriate development of multisensory speech integration (MSI) greatly affects a child’s ability to relate with others. Ample experimental evidence has shown that MSI is highly immature at birth and continues to develop late into childhood (Brandwein et al., 2011). Moreover, children with autism spectrum disorder (ASD), who present impaired MSI early in life, show an amelioration in the adolescent years (de Boer-Schellekens et al., 2013; Foxe et al., 2015). This evidence suggests that there may be delays in the maturation of MSI for children with ASD that resolve at this point. Multiple studies have shown multisensory processing deficits in ASD in the absence of comparable unisensory deficits, suggesting that they represent an impairment of neural processes that have a direct and specific impact on MSI. However, the neural basis of the impairment remains unknown.

A region of particular interest for the maturation of MSI is the superior temporal sulcus (STS), an association cortex involved in speech perception (Molholm et al., 2014) that is also frequently implicated in audiovisual multisensory processing (Bolognini et al., 2009). This region must be considered in the context of its feedforward inputs from auditory and visual cortices. Converging evidence reveals that MSI occurs at very early stages of cortical processing and in sensory cortical regions, although the functional role of early MSI (at the onset of cortical sensory processing in some cases; Molholm et al., 2002) remains unknown.

Several experimental studies have pointed out that auditory speech recognition is relatively mature at 5 to 9 years of age, approaching adult-like performance (e.g., Fallon, Trehub & Schneider, 2000; Kraus, Koch, McGee, Nicol, & Cunningham, 1999), at ages where multisensory speech processing is not (Foxe et al., 2015).


Such observations suggest a neural model in which the maturation of MSI in speech perception follows from the reinforcement of direct “cross-modal” excitatory connections between auditory and visual speech representations in unisensory cortices. In this case, it can be assumed that connections among unisensory areas are initially relatively ineffective, but that they strengthen as a consequence of relevant multisensory experiences through a Hebbian learning mechanism. Thus, multisensory experience would affect only the ability of STS elements to detect multisensory stimuli, via a reciprocal reinforcement of unisensory activities when both are active, but it would not provide the STS with any additional information in case of unisensory stimulation.

The aim of the present work is to test the feasibility of this model, and its consequences, by using a computational model inspired by neurophysiology and based on a previous model implemented to study cortical multisensory interaction (Magosso et al., 2012; Cuppini et al., 2014). In particular, with the model we wish to i) analyze possible mechanisms underlying the maturation of MSI; ii) test the model’s ability to reproduce different results concerning speech MSI in terms of accuracy, as well as whether it produces the well-known McGurk illusion; and iii) provide possible explanations of the neural processing differences that could lead to a slower maturation of MSI in participants with ASD, followed by a full recovery during adolescence.

In particular, we describe the training mechanisms implemented to simulate the maturation phase and we test a hypothesis to explain ASD deficits in speech MSI: a different multisensory experience during the maturation process, due to a lack of attention in young children (attentional bias), is responsible for the different maturation in ASD. All the simulated responses are compared with behavioral data present in the literature.

Method

The model consists of a multisensory region (STS) of N multisensory units (N = 180), receiving excitatory projections from two arrays of N auditory and N visual units (see Fig. 1). Unit response to any input is described with a first-order differential equation, which simulates the integrative properties of the cellular membrane, and a steady-state sigmoidal relationship, which simulates the presence of a lower threshold and an upper saturation for neural activation. The saturation value is set at 1, i.e., all outputs are normalized to the maximum. In the following, the term “activity” is used to denote unit output.
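For concreteness, the following minimal sketch (in Python/NumPy) illustrates how such unit dynamics could be simulated; the Euler integration step and all parameter values (time constant, sigmoid slope and threshold) are illustrative assumptions, as the paper does not report its numerical constants.

```python
import numpy as np

def sigmoid(u, theta=0.5, slope=10.0):
    """Steady-state activation: a lower threshold (theta) and an upper
    saturation at 1, so all outputs are normalized to the maximum."""
    return 1.0 / (1.0 + np.exp(-slope * (u - theta)))

def euler_step(y, net_input, tau=10.0, dt=0.1):
    """One Euler step of the first-order membrane equation
    tau * dy/dt = -y + sigmoid(net_input); y is the unit "activity"."""
    return y + (dt / tau) * (-y + sigmoid(net_input))

# Example: 180 units relaxing toward the activation evoked by a constant
# suprathreshold input delivered to one unit of the map.
y = np.zeros(180)
u = np.zeros(180)
u[90] = 1.0
for _ in range(2000):
    y = euler_step(y, u)
print(y[90])   # close to sigmoid(1.0), i.e. near saturation
```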
Auditory and visual units are devoted to the processing of information regarding speech sounds and speech gestures (i.e., lip and face movements; see e.g., Bernstein & Liebenthal, 2014), and are topologically organized according to a similarity principle. This means that two similar sounds or lip movements activate proximal neural groups in these areas. The topological organization in these cortical regions is realized by assuming that each unit is connected with the other elements of the same area via lateral excitatory and inhibitory connections (intra-area connections, L in Fig. 1), described by a Mexican-hat disposition, i.e., proximal units reciprocally excite each other and inhibit more distal ones. This disposition produces an “activation bubble” in response to a specific auditory or visual input: not only the neural element representing that individual feature is activated, but also the proximal ones linked via sufficient lateral excitation. This arrangement can have important consequences for the correct perception of phonemes, for instance resulting in illusory perceptual phenomena like the well-known McGurk effect (see section Results). In this work, lateral intra-area connections are not subject to training, since we assumed that this process took place earlier in life than the acquisition of MSI.


Figure 1: Architecture of the network. Each circle represents a unit. Each region is made of 180 elements. Dashed lines represent weights (Wav, Wva) acquired during crossmodal training, which simulates associative learning between speech sounds and gestures. Units in the same region are reciprocally connected through lateral synapses (La, Lv and Ls), described by a Mexican-hat function. Units in the unisensory regions send excitatory connections (Wsv, Wsa) to the corresponding elements in the multisensory area.

Furthermore, units in the auditory and visual regions also receive an external input (corresponding to a speech sound and/or a gesture representation of the presented phoneme). These visual and auditory inputs are described with a Gaussian function. The central point of the Gaussian function corresponds to a specific speech sound/gesture, its amplitude corresponds to the stimulus intensity, and the standard deviation accounts for the uncertainty of the stimulus representation. In this model, for simplicity, the two inputs are described with the same function. To reproduce experimental variability, a noisy component, drawn from a uniform distribution, was added to the external input. Moreover, since the outside inputs are mediated by long-range excitatory connections, their temporal aspects are described by using second-order kinetics, similar to those commonly adopted to mimic the glutamatergic synaptic response (i.e., an impulse produces a response similar to an alpha function; see also Jansen and Rit, 1995). These kinetics are characterized by different time constants (τa and τv for the two modalities), simulating auditory and visual processing in the cortex.
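The following fragment sketches both ingredients: the noisy Gaussian input and a normalized alpha-function kernel for the second-order kinetics. Widths, amplitudes, noise level and time constants are placeholders chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def external_input(N=180, center=90, amplitude=1.0, sigma=3.0, noise=0.1):
    """Gaussian input centred on the unit coding the presented phoneme,
    plus a noisy component drawn from a uniform distribution."""
    x = np.arange(N)
    g = amplitude * np.exp(-(x - center)**2 / (2 * sigma**2))
    return g + rng.uniform(-noise, noise, N)

def alpha_kernel(t, tau):
    """Normalized alpha function (t/tau)*exp(1 - t/tau): the impulse
    response of second-order synaptic kinetics, peaking at t = tau."""
    return (t / tau) * np.exp(1.0 - t / tau)

t = np.linspace(0.0, 100.0, 1001)      # ms
h_a = alpha_kernel(t, tau=10.0)        # auditory kinetics (tau_a, assumed)
h_v = alpha_kernel(t, tau=20.0)        # visual kinetics (tau_v, assumed)
```

Convolving the time course of each input with the corresponding kernel (e.g., with np.convolve) yields the modality-specific temporal profile described above.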
Finally, we consider a cross-modal input, computed assuming that units of the two areas could be reciprocally linked via long-range excitatory connections (Wav, Wva in Fig. 1), described by a pure latency and the same second-order kinetics employed to mimic the temporal aspects of the external inputs. We assume that, in the network’s initial configuration, corresponding to an early period of life, these connections have negligible strength, but are subject to a training phase (see below) during which the network learns to associate the auditory (speech sounds) and visual (speech gestures) representations of the same phonemes.

The third area simulates multisensory units in a cortical region (STS) known to be involved in phoneme comprehension tasks and in MSI. These units are linked via lateral connections with a Mexican-hat arrangement, implementing a similarity principle (Ls, in Fig. 1).

Inputs to the multisensory area were generated by long-range excitatory connections from the unisensory regions (Wsv, Wsa): we used a delayed onset (pure latency) and second-order kinetics to mimic the temporal aspects of these inputs. The connections between unisensory and multisensory regions were realized with a Gaussian function, assuming stronger and more focused connections coming from the auditory region (Wsa), and more diffuse but weaker connections coming from the visual area (Wsv). This asymmetric connectivity helps explain the experimental results present in the literature about the better abilities in speech identification in case of auditory stimulation, compared with the poor performance in the case of visual inputs. This different representation is assumed to be the final state of a process of unisensory map refinement in STS, which takes place in early stages of life. This development could be included in future implementations of the model, as an earlier training phase based on the evidence that auditory stimuli are more informative than the visual representations of words. The feedforward connectivity is also responsible for the presence of early weak integrative phenomena in the younger ASD group (Fig. 1A, Foxe et al., 2015). In the present model, neither of these connections is modified during the learning period, due to the relative stability of representations of unisensory speech features at the ages considered (~7 years of age upward). Finally, the output of the STS units is compared with a fixed threshold to mimic the perceptual ability to correctly identify speech (detection threshold).
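One way to render this asymmetry is to build both feedforward matrices from the same Gaussian law with different gains and widths, as in the sketch below; the specific values are illustrative assumptions.

```python
import numpy as np

def gaussian_weights(N=180, gain=1.0, sigma=1.0):
    """Inter-area weights from unisensory unit j to STS unit i, falling
    off with the distance between the units' preferred phonemes."""
    idx = np.arange(N)
    d2 = (idx[:, None] - idx[None, :])**2
    return gain * np.exp(-d2 / (2 * sigma**2))

W_sa = gaussian_weights(gain=1.0, sigma=0.5)   # auditory: strong, focused
W_sv = gaussian_weights(gain=0.4, sigma=4.0)   # visual: weak, diffuse
```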
Training the Network

We simulated a normal training period by presenting thousands of unisensory and multisensory speech representations (up to 25,000 inputs) to the network, to mimic a normal experience with speech stimuli: specifically, we trained the network with 80% congruent auditory and visual stimuli and 20% auditory stimuli alone. During the training phase we used suprathreshold stimuli at their highest level of efficacy, i.e., stimuli able to excite unisensory units close to the upper saturation, in order to speed up the modeling process.

Figure 2: Representation of the training mechanisms simulating the associative learning between sounds (Auditory area) and gestures (Visual area) of units of speech. In case of multisensory stimuli, speech sounds are presented along with the corresponding lip movements. Thanks to the Hebbian learning rules, connections among simultaneously active units are reinforced. Hence, the network learns how to associate the auditory and visual representations of the same speech events, and this knowledge is implemented in the synaptic architecture between the unisensory regions.

These stimuli were generated through a uniform probability distribution. Each stimulus lasted 130 ms, during which, after an initial transient period, the connections among visual and auditory representations of the same phonemes were shaped by Hebbian algorithms of long-term potentiation (LTP) and long-term depression (LTD). In particular, we chose a presynaptic gating rule, which means that the training algorithm only modifies the connections coming from an active unit, and their strength is modified based on the activity of the postsynaptic units. As an example, if a presynaptic auditory element is active, it reinforces connections targeting a simultaneously active visual unit (likely representing the same speech unit), and weakens connections with silent visual elements (likely those coding for different speech inputs; see Fig. 2) (see Gerstner & Kistler, 2002). In order to establish this correlation, the activity of the individual units (both presynaptic and postsynaptic) is compared with a given threshold, to determine whether the unit can be considered active or silent. The strengthening and depression processes are also subject to a saturation rule, which means that each single connection cannot exceed a maximum value, nor decrease below zero.
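A compact implementation of this presynaptic gating rule with saturation could look as follows; the learning rates, the activation threshold and the maximum weight are illustrative placeholders.

```python
import numpy as np

def hebbian_update(W, pre, post, lr_pot=0.02, lr_dep=0.01,
                   theta=0.5, w_max=0.5):
    """Presynaptic-gated Hebbian rule. Only connections leaving an
    active presynaptic unit (pre > theta) are modified: potentiated
    toward active postsynaptic units (LTP), depressed toward silent
    ones (LTD). W[i, j] links presynaptic unit j to postsynaptic unit
    i; weights saturate between 0 and w_max."""
    pre_on = (pre > theta).astype(float)
    post_on = (post > theta).astype(float)
    dW = lr_pot * np.outer(post_on, pre_on) \
       - lr_dep * np.outer(1.0 - post_on, pre_on)
    return np.clip(W + dW, 0.0, w_max)

# Congruent auditory and visual "bubbles" around unit 90 reinforce the
# cross-modal weights linking the two representations of that phoneme.
a = np.zeros(180); a[88:93] = 1.0
v = np.zeros(180); v[88:93] = 1.0
W_va = hebbian_update(np.zeros((180, 180)), pre=a, post=v)
```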
Finally, to simulate the delayed developmental processes taking place in ASD children, we trained and tested the network using a lower level of multisensory experience, precisely 20% multisensory stimuli plus 80% auditory stimuli.
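Under these assumptions, the only difference between the simulated TD and ASD training regimes is the probability of drawing a congruent audio-visual trial, e.g.:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_trial(p_av, n_phonemes=180):
    """One training trial: a uniformly drawn phoneme, presented
    audio-visually with probability p_av, auditory-alone otherwise."""
    phoneme = int(rng.integers(0, n_phonemes))
    modality = "AV" if rng.random() < p_av else "A"
    return phoneme, modality

td_trials = [sample_trial(0.8) for _ in range(25_000)]   # TD: 80% AV
asd_trials = [sample_trial(0.2) for _ in range(25_000)]  # ASD: 20% AV
```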



Results

A first set of simulations was performed to evaluate the network’s ability to correctly identify speech before the model had been exposed to training (Fig. 3). Mature unisensory maps in the auditory and visual regions were assumed in this model, as described in the previous section.

Figure 3: Average word recognition performance (% correct) before training. Panel A reports the percentage of correct speech recognition (y-axis) in case of auditory stimulation (dashed line) or multisensory stimulation (solid line). These data represent the mean of correct recognitions over 3600 different presentations for each level of stimulus efficacy (reported on the x-axis). A correct recognition has been computed by comparing the activity elicited in a unit in the STS region, coding for a specific phoneme, with a threshold (fixed at 30% of its maximum value). Panel B reports the Multisensory Speech Integration (MSI) abilities of the network, computed as the difference between the percentage of correct detections in case of crossmodal stimulation and its counterpart in case of auditory stimuli.

In this phase, representations of speech in the two unisensory regions are independently activated by the two modality-specific external stimuli, and do not interact through direct long-range excitatory projections between the unisensory cortical regions, which are still ineffective. Hence, they independently stimulate the corresponding units in the STS region. As shown in Fig. 3, in this initial condition, an effective auditory stimulus alone is sufficient to produce a high percentage of correct speech sound identifications, as in mature adult-like behaviour. If the auditory stimulus is coupled with a simultaneous visual representation of the same phoneme, the network shows some benefit, although this is relatively low and no greater than a 20% MSI gain over all stimuli and levels of efficacy.

So, the network in its initial stage is characterized by: i) mature abilities in speech-recognition tasks in case of auditory-alone stimulation, but ii) poor multisensory integration (see Fig. 3). These results are in agreement with what one would expect prior to significant training, and indeed are well aligned with what we see in our data, in which younger children show a relatively immature ability to benefit from MSI, whereas auditory speech recognition is significantly closer to mature performance levels.
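Scoring follows directly from these definitions; since unit outputs are normalized to 1, the detection threshold of Fig. 3 is simply 0.3. A sketch of the two measures:

```python
import numpy as np

def is_correct(sts_activity, phoneme, thresh=0.3):
    """A presentation is correct when the STS unit coding the presented
    phoneme exceeds 30% of its maximum output (outputs saturate at 1)."""
    return sts_activity[phoneme] > thresh

def msi_gain(correct_av, correct_a):
    """MSI index (Fig. 3B): percentage of correct detections under
    crossmodal stimulation minus the auditory-alone percentage."""
    return 100.0 * (np.mean(correct_av) - np.mean(correct_a))
```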
Developmental process and audio-visual speech recognition

The model in its initial state was repeatedly stimulated with modality-specific and cross-modal inputs (see section Training) in order to simulate the experience of a child with different sensory representations of phonemes. The weights of the inter-area projections among unisensory elements in the visual and auditory regions adjusted according to Hebbian dynamics. We tested MSI in the final “adult-like” configuration and throughout the developmental process, using the same testing paradigm used to evaluate the MSI behavior in the immature phase.

Figure 4: MSI acquisition with a multisensory experience of 20% during the training (red lines) compared with the normal multisensory experience (blue lines).

One possible explanation for reduced MSI in ASD is that learning is less effective in this group. The specific hypothesis tested here is that these individuals experience fewer multisensory exposures, possibly due to how attention is allocated (e.g., suppression of unattended signals; selectively focusing on one sensory modality at a time; not looking at faces consistently). We therefore tested the impact of the percentage of multisensory versus unisensory exposures on the maturation of MSI in the model.

Fig. 4 reports the weight maturation (left panel) and MSI abilities (right panel) at different epochs, for a training phase in which the network was exposed to sensory training with just 20% of multisensory stimuli.

Even with such a poor multisensory experience, the network can reach “TD-like” behaviour in terms of MSI, as shown in Fig. 4, although this maturation requires 15,000 training epochs.
This result suggests that multisensory integration in the model strongly depends on connections from the visual to the auditory region.

Simulation of the McGurk effect

An important consequence of training in our model is that the audio-visual influence becomes stronger after training, because of connection-weight reinforcement among unisensory areas. This change has important consequences for the development of audio-visual illusions. Since unisensory areas in our model code for speech, a typical illusion is the well-known McGurk effect (McGurk & MacDonald, 1976). In this illusion, incongruent auditory speech is dubbed onto visual speech, and the resulting auditory speech percept corresponds to a fusion of the auditory and visual speech stimuli, or to the visual speech stimulus, but not to the veridical auditory speech stimulus.

We performed an additional set of simulations with the model (both in the mature and immature configurations) to reproduce a McGurk-type situation. Specifically, we presented mismatched (at a four-position distance) auditory-visual speech to the network and analyzed the activities elicited in all areas. We say that the McGurk effect is evident when the detected phoneme (computed as the barycenter of activity in the multisensory region) is different from that used in the auditory input. The network in the immature configuration is characterized by a limited visual influence on the speech percept. Therefore, the activity in the auditory region is almost unaffected by the visual stimulus. In this case, the auditory modality plays the dominant role in guiding speech perception. In 42.5% of presentations, the model identifies the auditory input correctly, while the McGurk effect is present less than 30% of the time. In the remaining 27.2% of cases, no phoneme reaches the detection threshold.
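The decoding rule used to score these trials can be sketched as follows; computing the barycenter over the whole map is a simplification of ours, and the detection threshold is the same fixed value introduced above.

```python
import numpy as np

def detected_phoneme(sts_activity, det_thresh=0.3):
    """Decode the percept as the barycenter (centre of mass) of the
    activity in the multisensory region; return None when no unit
    reaches the detection threshold (no phoneme perceived)."""
    if sts_activity.max() < det_thresh:
        return None
    positions = np.arange(len(sts_activity))
    return int(round(np.sum(positions * sts_activity)
                     / np.sum(sts_activity)))

# A trial shows a McGurk-type percept when the decoded phoneme differs
# from the position of the (veridical) auditory input.
```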
After training, the model is much more susceptible to the AV illusion, with responses affected by the visual information on almost 72% of the simulation trials.

Discussion

Different computational models have been developed in recent years to investigate the general problem of multisensory integration in the brain (see Cuppini et al., 2011 and Ursino et al., 2014 for a review). Some of them, in agreement with several psychophysical and behavioral data, are based on a Bayesian approach (Anastasio et al., 2000; Knill and Pouget, 2004; Körding et al., 2007). Others assume that integration is an emergent property of network dynamics (Patton and Anastasio, 2003; Ursino et al., 2009). Finally, some models have been realized to deal with the problem of multisensory integration in semantic memory and lexical aspects (Rogers et al., 2004; Ursino et al., 2010, 2015).

Concerning the specific problem of speech recognition, Ma et al. (2009) implemented a Bayesian model of optimal cue integration able to explain the visual influence on auditory perception in a noisy environment. They explained different perceptual behaviors based on the representation of words as a collection of phonetic features in a topographically organized feature space.

Although the previous computational efforts simulated experimental data quite well, none of them was able to explain the maturation of MSI in speech perception, the different developmental trajectory in ASD, or how these capabilities are instantiated in the circuit.

The present model, in its mature architecture, simulates many experimental findings present in the literature regarding speech MSI. From this point of view, the fundamental assumption is that the adult configuration implements a two-step cross-modal integration: the first at the level of unisensory areas, mediated by the cross-modal connections between visual and auditory regions; the second at the level of the multimodal area, due to the presence of convergent feedforward connections. With this model, we reproduced the improvement in correct phoneme recognition in audiovisual vs auditory conditions at different signal-to-noise levels (Foxe et al., 2015), and we simulated the main aspects of the McGurk effect.

A second important aspect of our study is the capacity to mimic and to understand the developmental differences between TD subjects and ASD children regarding the cross-modal abilities observed with age. In particular, the model explains the results of a recent study by Foxe et al. (2015), using two main assumptions. First, the feedforward connections from unisensory areas to the multisensory area are already mature at an early age (here, this corresponds to the condition of the untrained network) and the auditory feedforward connections are stronger than the visual ones. Second, the cross-modal connections between unimodal areas are created during development, under the pressure of a multimodal environment (i.e., auditory + visual stimuli), and this process is faster in TD subjects than in ASD subjects. This assumption agrees with the widespread idea that ASD subjects have decreased long-range connectivity, and that autism is a functional disconnection syndrome, in which the core deficit derives from the poor capacity to functionally connect remote regions of the brain (Melillo & Leisman, 2009). Since the reason for this decreased connectivity is still unclear, the model tested a possible scenario in which a reduced number of cross-modal stimuli (reflecting a reduced attention of the subject to the external world) is a likely mechanism responsible for the differences between TD and ASD.

These differences may lead to some testable predictions: from the results of the training phase, one can expect that ASD children trained with a high percentage of cross-modal stimuli could exhibit a normal, or at least a quicker, MSI maturation. A second prediction is that, as a consequence of poor cross-modal connections among unisensory areas, young individuals with ASD show a less evident McGurk effect, but that at the end of the developmental phase this illusion becomes comparable in the two groups. The first prediction is still to be tested; the second is supported by some experimental results in the literature, in particular by comparing data across different studies.
However, this deserves a deeper investigation through a single, ideally longitudinal, study.

Future developments of this model may include a more detailed and biologically realistic description of the unisensory areas, and the inclusion of further regions to simulate the role played in MSI and speech perception by subcortical structures, such as the thalamus and basal ganglia.
References

Anastasio, T.J., Patton, P.E., & Belkacem-Boussaid, K. (2000). Using Bayes rule to model multisensory enhancement in the superior colliculus. Neural Computation, 12, 1165-1187.

Bernstein, L.E., & Liebenthal, E. (2014). Neural pathways for visual speech perception. Frontiers in Neuroscience, 8.

de Boer-Schellekens, L., Keetels, M., Eussen, M., & Vroomen, J. (2013). No evidence for impaired multisensory integration of low-level audiovisual stimuli in adolescents and young adults with autism spectrum disorders. Neuropsychologia, 51(14), 3004-3013.

Bolognini, N., Miniussi, C., Savazzi, S., Bricolo, E., & Maravita, A. (2009). TMS modulation of visual and auditory processing in the posterior parietal cortex. Experimental Brain Research, 195(4), 509-517.

Brandwein, A.B., Foxe, J.J., Russo, N.N., Altschuler, T.S., Gomes, H., & Molholm, S. (2011). The development of audiovisual multisensory integration across childhood and early adolescence: a high-density electrical mapping study. Cerebral Cortex, 21(5), 1042-1055.

Cuppini, C., Magosso, E., Bolognini, N., Vallar, G., & Ursino, M. (2014). A neurocomputational analysis of the sound-induced flash illusion. NeuroImage, 92, 248-266.

Cuppini, C., Magosso, E., & Ursino, M. (2011). Organization, maturation, and plasticity of multisensory integration: insights from computational modeling studies. Frontiers in Psychology, 2.

Fallon, M., Trehub, S.E., & Schneider, B.A. (2000). Children's perception of speech in multitalker babble. The Journal of the Acoustical Society of America, 108(6), 3023-3029.

Foxe, J.J., Molholm, S., Del Bene, V.A., Frey, H.P., Russo, N.N., Blanco, D., Saint-Amour, D., & Ross, L.A. (2015). Severe multisensory speech integration deficits in high-functioning school-aged children with autism spectrum disorder (ASD) and their resolution during early adolescence. Cerebral Cortex, 25(2), 298-312.

Gerstner, W., & Kistler, W.M. (2002). Mathematical formulations of Hebbian learning. Biological Cybernetics, 87(5-6), 404-415.

Jansen, B.H., & Rit, V.G. (1995). Electroencephalogram and visual evoked potential generation in a mathematical model of coupled cortical columns. Biological Cybernetics, 73(4), 357-366.

Knill, D.C., & Pouget, A. (2004). The Bayesian brain: the role of uncertainty in neural coding and computation. Trends in Neurosciences, 27(12), 712-719.

Körding, K.P., Beierholm, U., Ma, W.J., Quartz, S., Tenenbaum, J.B., & Shams, L. (2007). Causal inference in multisensory perception. PLoS ONE, 2(9), e943.

Kraus, N., Koch, D.B., McGee, T.J., Nicol, T.G., & Cunningham, J. (1999). Speech-sound discrimination in school-age children: psychophysical and neurophysiologic measures. Journal of Speech, Language, and Hearing Research, 42(5), 1042-1060.

Ma, W.J., Zhou, X., Ross, L.A., Foxe, J.J., & Parra, L.C. (2009). Lip-reading aids word recognition most in moderate noise: a Bayesian explanation using high-dimensional feature space. PLoS ONE, 4(3), e4638.

Magosso, E., Cuppini, C., & Ursino, M. (2012). A neural network model of ventriloquism effect and aftereffect. PLoS ONE, 7(8), e42503.

McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746-748.

Melillo, R., & Leisman, G. (2009). Autistic spectrum disorders as functional disconnection syndrome. Reviews in the Neurosciences, 20(2), 111-132.

Molholm, S., Ritter, W., Murray, M.M., Javitt, D.C., Schroeder, C.E., & Foxe, J.J. (2002). Multisensory auditory-visual interactions during early sensory processing in humans: a high-density electrical mapping study. Cognitive Brain Research, 14(1), 115-128.

Molholm, S., Mercier, M.R., Liebenthal, E., Schwartz, T.H., Ritter, W., Foxe, J.J., & De Sanctis, P. (2014). Mapping phonemic processing zones along human perisylvian cortex: an electro-corticographic investigation. Brain Structure and Function, 219, 1369-1383.

Nath, A.R., & Beauchamp, M.S. (2012). A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. NeuroImage, 59, 781-787.

Patton, P.E., & Anastasio, T.J. (2003). Modeling cross-modal enhancement and modality-specific suppression in multisensory neurons. Neural Computation, 15, 783-810.

Rogers, T.T., Lambon Ralph, M.A., Garrard, P., Bozeat, S., McClelland, J.L., Hodges, J.R., & Patterson, K. (2004). The structure and deterioration of semantic memory: a neuropsychological and computational investigation. Psychological Review, 111, 205-235.

Ursino, M., Cuppini, C., & Magosso, E. (2010). A computational model of the lexical semantic system based on a grounded cognition approach. Frontiers in Psychology, 1, 221.

Ursino, M., Cuppini, C., & Magosso, E. (2014). Neurocomputational approaches to modelling multisensory integration in the brain: a review. Neural Networks, 60, 141-165.

Ursino, M., Cuppini, C., & Magosso, E. (2015). A neural network for learning the meaning of objects and words from a featural representation. Neural Networks, 63, 234-253.

Ursino, M., Cuppini, C., Magosso, E., Serino, A., & Di Pellegrino, G. (2009). Multisensory integration in the superior colliculus: a neural network model. Journal of Computational Neuroscience, 26(1), 55-73.