<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>PLOS Computational Biology 16 (2020) e1007796. doi:10.1371/journal.pcbi.1007796.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1371/journal.pcbi.1007796</article-id>
      <title-group>
        <article-title>A Survey of Brain-Inspired Mechanisms for Neuro-Symbolic Reasoning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Florin Leon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>“Gheorghe Asachi” Technical University of Iaşi</institution>
          ,
          <country country="RO">Romania</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>33</volume>
      <fpage>115</fpage>
      <lpage>121</lpage>
      <abstract>
        <p>Recent advances in Large Language Models have demonstrated that Transformer-based architectures can support symbolic-like reasoning without explicit symbolic formalisms. However, these models remain resource-intensive, opaque, and sometimes limited in systematic generalization and memory control. This paper reviews a set of biologically inspired mechanisms that may offer more efficient and flexible alternatives for implementing reasoning in neural systems, or even introduce new design principles. We explore models that address variable binding, compositionality, contextual inference and embedding, neuro-symbolic integration, and architectural designs inspired by neuroscience. They reveal how symbolic operations can emerge from continuous, distributed neural dynamics.</p>
      </abstract>
      <kwd-group>
        <kwd>Neuro-symbolic methods</kwd>
        <kwd>brain-inspired mechanisms</kwd>
        <kwd>reasoning large language models (large reasoning models)</kwd>
        <kwd>survey</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In early 2025, there was a significant shift in the landscape
of Artificial Intelligence (AI) when Large Language Models
(LLMs) capable of robust multi-step reasoning entered the
mainstream. These Transformer-based systems now
demonstrate emergent algorithmic behaviors in a wide range of
tasks, without explicit internal symbolic representations.
As a result, biologically inspired approaches to reasoning,
which once seemed essential for integrating neural and
symbolic systems, seem to have lost their sense of urgency and
even necessity.</p>
      <p>However, the efficiency and interpretability challenges
faced by LLMs raise new questions about the role that
neuroscience can still play in the design of reasoning systems.
Despite their impressive performance, current models
remain highly resource-intensive, as they require massive
datasets and extensive computation to achieve their results.
In contrast, biological systems perform coherent
compositional reasoning using far fewer resources. This motivates
a look at brain-inspired computation, not as an alternative
to Transformers, but as a potential source of principles for
improving their scalability, modularity, and reasoning
capabilities.</p>
      <p>Although many studies explore neuro-symbolic (NS)
methods, we focus on a smaller group of works that seem
especially relevant to improving reasoning in LLMs. Rather
than aiming for comprehensive coverage or classifying
existing hybrid architectures, these examples were chosen for
their potential to suggest concrete principles that could
inspire more efficient neural systems. Also, while logic has
long been considered the epitome of reasoning, and many
works combine neural architectures with logical inference,
we place less emphasis on these approaches in this survey.
This is not due to their lack of value, but because we view
logical reasoning not as an innate product of brain
function, but as a learned, constrained mode of thought that
appears under specific conditions. Instead, we analyze a
range of biologically inspired mechanisms that could inform
or complement reasoning in artificial systems, and illustrate
how symbolic-like behavior can emerge from dynamic,
distributed processes.</p>
      <p>
        Several recent works have also surveyed the growing field
of neuro-symbolic integration and brain-inspired AI. For
example, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presents a taxonomy of NS learning systems,
and emphasizes three paradigms: learning for reasoning,
reasoning for learning, and learning-reasoning. It covers
technical models and application domains, but its scope is
broad and not specifically tailored to reasoning methods.
Similarly, [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] offers a wide-ranging overview of neuro-symbolic AI (NSAI), with
a focus on representation, learning, and decision making for
fields such as robotics and healthcare. Another review [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
explores NS approaches for Artificial Intelligence of Things
(AIoT) applications, while [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] discusses biologically inspired
strategies in the broader effort to realize Artificial General
Intelligence (AGI).
      </p>
      <p>This survey highlights five key aspects of reasoning in
neural systems that align with mechanisms observed in
biological cognition. These include the ability to maintain
variable identity and role assignment, the formation and
reuse of abstract functional components, the encoding of
context-sensitive information, the integration of symbolic
computation with neural dynamics, and the use of
biologically inspired memory and control architectures. Thus,
in the rest of the article, we analyze techniques relevant
for variable binding (Section 2), compositionality (Section
3), handling context and embeddings (Section 4),
neuro-symbolic and hybrid systems (Section 5), and architectures
inspired by neuroscience (Section 6). This perspective allows
us to extract some insights from neuroscience and assess
their applicability to current or future reasoning models.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Variable Binding</title>
      <p>Understanding how variable binding is implemented in
neural systems, both artificial and biological, is central
to explaining and improving reasoning. We begin with
a study of LLMs, which already exhibit symbolic-like
behavior, and then explore progressively more biologically
grounded mechanisms that achieve similar goals.</p>
      <p>
        The manner in which Transformer-based LLMs perform
variable binding during in-context reasoning tasks is
analyzed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The authors formalize the binding problem as
the need to associate each entity with the correct attribute
in multi-entity contexts. Through causal interventions, they
uncover a distributed mechanism in which “binding ID”
vectors are added to the base representations of entities and
attributes. These vectors reside in a continuous subspace,
are independent of token position, and allow LLMs to
perform queries with factorizable and position-invariant
behavior. The binding vectors transfer across tasks and models,
which suggests that LLMs implicitly learn a general-purpose
representational subspace for symbolic associations. This
mechanism reveals how standard Transformer architectures
can internally implement reasoning via learned, reusable
vector-based operations.
      </p>
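      <p>The additive binding idea can be illustrated with a toy sketch. It assumes random vectors stand in for the base representations and for the binding IDs, and all function names are hypothetical; it is not the procedure used in the cited study, which relies on causal interventions on real model activations. The query removes the base representation and compares the remaining binding components.</p>

```python
import random

def rand_vec(dim, rng):
    """Draw a random Gaussian vector (stand-in for a learned representation)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bind_context(pairs, base, binding_ids):
    """Add the k-th binding ID to both members of the k-th (entity, attribute) pair."""
    entities, attributes = {}, {}
    for k, (ent, attr) in enumerate(pairs):
        entities[ent] = add(base[ent], binding_ids[k])
        attributes[attr] = add(base[attr], binding_ids[k])
    return entities, attributes

def query(entity, entities, attributes, base):
    """Return the attribute whose binding component best matches the entity's."""
    ent_binding = sub(entities[entity], base[entity])
    scores = {
        attr: dot(ent_binding, sub(vec, base[attr]))
        for attr, vec in attributes.items()
    }
    return max(scores, key=scores.get)

# Assumed toy setup: two entities, two attributes, 64-dimensional vectors.
rng = random.Random(0)
dim = 64
names = ["Alice", "Bob", "Paris", "Rome"]
base = {name: rand_vec(dim, rng) for name in names}
binding_ids = [rand_vec(dim, rng) for _ in range(2)]
entities, attributes = bind_context(
    [("Alice", "Paris"), ("Bob", "Rome")], base, binding_ids
)
print(query("Alice", entities, attributes, base))
print(query("Bob", entities, attributes, base))
```

      <p>Because the binding component is independent of the base vector, swapping the two binding IDs would swap the answers, which loosely mirrors the factorizable behavior described above.</p>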
      <p>
        Although Transformer LLMs can succeed in binding
using learned vector representations, their architecture lacks
explicit mechanisms for controlled storage, reuse, or
indirection. The next paper [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] addresses this limitation by
introducing a biologically inspired solution based on neural
gating and addressing. It presents a neurocomputational
model in which the prefrontal cortex (PFC) and basal
ganglia (BG) jointly support variable binding via indirection. In
this model, one PFC region or “stripe” encodes addresses or
pointers to information stored in other PFC regions. These
addresses are regulated by BG-mediated gating mechanisms,
which control when specific bindings are updated,
maintained, or used for output. Each stripe has a fixed identity
mapped onto a unique activation pattern, which enables
address-based routing without copying actual content. This
architecture enables flexible reuse of values for roles and
supports systematic generalization to new combinations
that are not encountered during training. It also illustrates
how indirection, an operation commonly used in computer
science, might be realized in biological circuits.
      </p>
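      <p>A minimal sketch of gated indirection follows. It assumes a drastically simplified setting in which stripes are list slots, addresses are integer indices, and the BG gate is a boolean flag; the class and method names are hypothetical and do not come from the cited model.</p>

```python
class StripeMemory:
    """Toy working memory: content stripes hold values, role stripes hold addresses."""

    def __init__(self, n_content):
        self.content = [None] * n_content   # values stored in content stripes
        self.roles = {}                     # role name -> address of a content stripe

    def store(self, address, value, gate_open):
        """Update a content stripe only when the (BG-like) gate is open."""
        if gate_open:
            self.content[address] = value

    def bind(self, role, address, gate_open):
        """Point a role stripe at a content stripe without copying the value."""
        if gate_open:
            self.roles[role] = address

    def read(self, role):
        """Output gating: follow the pointer held by the role stripe."""
        return self.content[self.roles[role]]

wm = StripeMemory(n_content=3)
wm.store(0, "Mary", gate_open=True)
wm.store(1, "ball", gate_open=True)
wm.bind("agent", 0, gate_open=True)
wm.bind("patient", 1, gate_open=True)
wm.store(0, "John", gate_open=False)   # gate closed: the stripe keeps "Mary"
wm.bind("agent", 1, gate_open=True)    # rebinding changes the pointer only
print(wm.read("agent"), wm.read("patient"))
```

      <p>Rebinding a role changes only the stored address, so the same value can serve several roles without being duplicated, which is the property that supports generalization to new role-value combinations.</p>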
      <p>
        While the PFC-BG model emphasizes dynamic routing
and gating at the systems level, the following paper [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
explores how neural circuits might perform variable
binding through associations at the level of spiking neurons. It
models sparse neuron assemblies in the brain, which form
dynamically in “neural spaces” and serve as pointers to
concept assemblies located in a separate “content space”. These
assembly pointers form through fast Spike-Timing
Dependent Plasticity (STDP), triggered by transient disinhibition.
This mechanism supports symbolic-like operations such as
binding, copying, and equality checking, all implemented
without explicit symbolic representations.
      </p>
      <p>
        In contrast to pointer models, the next approach [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
introduces a mechanism that supports symbolic binding through
memory segmentation within a unified autoassociative
network. It proposes Dynamically Partitionable
Autoassociative Neural Networks (DPAANNs) as a biologically
plausible architecture for solving the variable binding problem.
DPAANNs integrate a central attractor-based memory
system with dynamically segmentable buffers that represent
symbolic roles. These buffers can be quickly bound to values
by activating distinct subpopulations within the shared
autoassociative space. The model supports role-value
independence, allows compositional encoding and decoding, and
enables variable manipulation through buffer-specific
addressing, all using standard neural dynamics. Unlike anatomical
or synchrony-based binding approaches, DPAANNs offer
flexible, content-addressable binding without hardwiring or
oscillatory control. The model implements the core
mechanisms of the ACT-R cognitive architecture [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] by using the
dynamically segmentable buffers as a biologically plausible
substrate for its global workspace, and remains grounded in
realistic assumptions about connectivity, attractor dynamics,
and Hebbian plasticity.
      </p>
      <p>
        While both indirection and pointer-based models
emphasize spatial representation, the final paper [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] addresses
a fundamentally different axis of computation: time. It
explores how dynamic, oscillation-based synchronization can
solve the binding problem with minimal structural overhead.
It proposes that variable binding in working memory can be
implemented through time-based synchronization, where
each role-value pair is encoded by neural activity aligned to
a distinct oscillatory phase. This temporal encoding enables
multiple bindings to coexist without interference and allows
rapid binding and unbinding. The model links memory
capacity to oscillatory frequency. Slower oscillations permit
more distinct bindings, while faster ones reduce capacity.
Unlike synaptic binding, this method avoids persistent
connections and supports flexible reuse. Simulations
demonstrate how phase separation can maintain multiple bindings
concurrently and explain capacity constraints in working
memory.
      </p>
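      <p>The relation between oscillation frequency and capacity can be sketched with simple arithmetic. The sketch assumes that each binding needs a fixed, non-overlapping time window within one cycle; the 25 ms window and the function names are hypothetical choices for illustration, not parameters from the cited model.</p>

```python
import math

def capacity(freq_hz, slot_ms):
    """Number of non-overlapping phase slots that fit in one oscillation cycle."""
    period_ms = 1000.0 / freq_hz
    return int(math.floor(period_ms / slot_ms + 1e-9))

def assign_phases(bindings, freq_hz, slot_ms):
    """Give each (role, value) pair its own phase (radians) within the cycle."""
    slots = capacity(freq_hz, slot_ms)
    if len(bindings) > slots:
        raise ValueError("too many bindings for this oscillation frequency")
    period_ms = 1000.0 / freq_hz
    return {
        pair: 2.0 * math.pi * (k * slot_ms) / period_ms
        for k, pair in enumerate(bindings)
    }

pairs = [("agent", "Mary"), ("action", "gives"), ("object", "ball")]
# A slower rhythm (5 Hz) leaves room for more slots than a faster one (10 Hz).
print(capacity(5.0, 25.0), capacity(10.0, 25.0))
print(assign_phases(pairs, 5.0, 25.0))
```

      <p>Under these assumptions a 5 Hz cycle holds eight slots and a 10 Hz cycle holds four, and unbinding amounts to freeing a phase slot rather than changing any connection.</p>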
      <p>Together, these models illustrate a diverse set of
mechanisms (vector-based addition, indirection, assembly
pointers, attractor-based partitioning, and oscillatory phases)
through which variable binding can be achieved. They may
offer complementary strategies for improving
generalization, memory efficiency, and compositional reasoning in
future intelligent systems.</p>
      <sec id="sec-2-1">
        <title>Discussion</title>
        <p>The papers reviewed in this section address a core
challenge at the intersection of neural networks and symbolic
reasoning, i.e., how to represent and manipulate information
such as role-filler bindings or variable-value associations,
within distributed, trainable systems. The shared objective
is to discover mechanisms that allow neural models to
maintain the identity of variables across operations, dynamically
assign roles, and carry out functions like copying,
comparison, indirection, and substitution.</p>
        <p>A central insight in these works is that variable binding
need not rely on discrete symbols or fixed architecture.
Instead, it can emerge from learned dynamics over continuous
representations. Several models achieve this by
introducing neural forms of indirection, where patterns of activity
act as pointers to other representations. These pointers
can be created, reused, and recomposed dynamically, which
enables systematic generalization and flexible memory
access. Other approaches use oscillatory or time-based
mechanisms to encode multiple bindings in parallel, which allows
variable-role pairs to be represented and disentangled based
on temporal phase. Still others rely on attractor-based
partitioning or topologically organized buffers that support
role-specific storage and recombination.</p>
        <p>From the perspective of LLMs, these insights suggest
possible directions for architectural and training
innovations. For instance, the addition of explicit binding
subspaces, similar to learned role vectors or binding identities,
could improve the handling of coreference, substitution,
and memory-intensive tasks. Indirection and pointer-based
mechanisms could allow models to learn internal memory
protocols that support variable reuse in different contexts,
while time-based activation patterns could facilitate
dynamic binding and unbinding without overwriting content.
Such additions could enhance the ability to learn and apply
reusable abstractions, improve interpretability, and support
systematic generalization.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Compositionality</title>
      <p>
        To see how modern neural models could acquire and
generalize compositional structures, we begin this section with
evidence that standard networks may develop
modular internal subroutines spontaneously, and then trace a
path through increasingly explicit architectural and learning
designs that aim to support systematic generalization. The first study [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
defines a notion of compositionality and applies
model pruning techniques with continuous sparsification to
isolate subnetworks responsible for individual subfunctions.
These include tasks such as detecting spatial relations in
vision or subject-verb agreement in language. They find
that, across architectures and domains, subnetworks often
encode distinct, functionally specialized components. These
subnetworks can be ablated to selectively disrupt one
function without impairing others, which offers evidence for
modular task decomposition. Self-supervised pretraining
increases the clarity and consistency of this modular
organization, especially in language models. The study suggests
that gradient-based learning alone can produce
compositional representations under the right conditions.
      </p>
      <p>
        The next study [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] directly optimizes for compositional
behavior. It shows how training procedures, rather than
architectures alone, can induce systematic generalization.
Meta-Learning for Compositionality (MLC) is a method that
trains a standard Transformer to acquire human-like
compositional generalization. This is based on a large number
of short training episodes that involve learning a small
artificial language of invented words. In each episode, a few
examples of input-output word pairs are presented where,
e.g., “dax” refers to a red circle and “fep” means to repeat
the previous word three times. Afterwards, the model is
required to generate outputs for new combinations that were
not present during training. Through meta-learning, the
Transformer becomes proficient at inferring latent
grammars from limited data and applying learned rules to new
combinations. The resulting model exhibits both flexibility
and systematicity, outperforms standard neural networks,
and matches human generalization patterns. It also
replicates human-like errors, which reveals similar inductive
biases. This result shows that compositional reasoning can
emerge in standard architectures when training explicitly
promotes the development of compositionality.
      </p>
      <p>While MLC encourages systematicity through task
distribution, the next approach [13] focuses on architectural
inductive bias. It aims to make compositional reasoning
an explicit part of the internal operations of the network.
For this purpose, it proposes the Memory, Attention, and
Composition (MAC) network, a fully differentiable
architecture for visual reasoning. Each MAC cell maintains separate
control and memory states to represent the current
reasoning goal and the intermediate result. The model answers
questions by dividing them into a sequence of discrete steps,
where each cell selects a relevant part of the question,
retrieves information from the image, and updates memory
accordingly. The architecture uses soft attention and gated
updates to support flexible yet interpretable reasoning over
both language and vision. This design imposes a prior that
enforces stepwise reasoning. Thus, the network achieves
high accuracy and interpretability on the CLEVR benchmark
[14] without explicit supervision.</p>
      <p>Whereas MAC enforces compositional reasoning through
differentiable architectural constraints alone, the next work
[15] introduces the Neural-Symbolic Stack Machine (NeSS),
which embeds a discrete stack machine inside a neural
controller, so compositionality stems from both explicit
symbolic recursion and architectural design. The symbolic
component supports recursive operations such as “push”, “pop”,
and “reduce”, while the neural network learns to produce
execution traces without trace-level supervision. The model
achieves perfect generalization on multiple benchmarks, including SCAN
[16] and few-shot language tasks. A key innovation is the notion of operational
equivalence, which enables the model to infer
compositional categories by grouping functionally similar
expressions. This integration of symbolic components with neural
learning shows that deep networks can acquire abstract,
rule-like behavior when given the appropriate inductive
tools and execution model.</p>
      <p>Finally, paper [17] introduces the Relation Network (RN),
a module designed to perform relational reasoning by
explicitly computing pairwise relations among entities. An
RN computes a function over all object pairs and aggregates
these outputs to support downstream inference. RNs excel
in visual and text-based tasks that require the
understanding of inter-object relationships, such as comparing object
attributes or predicting physical interactions. They
outperform standard neural architectures on relational tasks,
particularly on the CLEVR dataset, and do so without
relying on symbolic inputs or supervision. By enforcing a
relational inductive bias at the module level, RNs provide a
plug-and-play mechanism for embedding reasoning within
otherwise generic networks.</p>
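      <p>The pairwise computation can be sketched as a sum of a pair function g over all ordered object pairs, followed by a readout f. In the sketch below, g and f are hand-written stand-ins for the learned networks, and the scene and function names are hypothetical.</p>

```python
from itertools import product

def relation_network(objects, g, f):
    """Apply g to every ordered object pair, sum the outputs, then apply f."""
    total = None
    for o_i, o_j in product(objects, repeat=2):
        out = g(o_i, o_j)
        total = out if total is None else [a + b for a, b in zip(total, out)]
    return f(total)

# Hand-written stand-ins for the learned networks g and f (assumptions).
def g_same_color(o_i, o_j):
    """Emit [1.0] when two distinct objects share a color, else [0.0]."""
    same = o_i is not o_j and o_i["color"] == o_j["color"]
    return [1.0 if same else 0.0]

def f_any(summed):
    """Answer whether any pair satisfied the relation."""
    return summed[0] > 0.0

scene = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "blue"},
    {"shape": "cylinder", "color": "red"},
]
print(relation_network(scene, g_same_color, f_any))
```

      <p>Because the sum runs over all pairs, the result does not depend on the order of the objects, and the same g is reused for every pair; the cost grows quadratically with the number of objects.</p>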
      <sec id="sec-3-1">
        <title>Discussion</title>
        <p>These papers show that compositionality can emerge
in neural systems through architectural constraints,
meta-learning regimes, or task designs that implicitly favor
modularity. The surveyed approaches differ in their methods,
benefits, and unresolved issues. One line of work uses
sparsification to isolate subnetworks linked to specific
subfunctions such as counting or comparison. These components
remain functional under ablation, which shows that
modular computation can emerge without explicit supervision.
However, this discovery happens after training, while an
important question is how to encourage persistent functional
separation during learning. Meta-learning methods like
MLC expose a Transformer to short synthetic episodes that
require rule abstraction and reuse. This setup enables strong
generalization and is also able to replicate human error
patterns. Still, the approach depends on many generated tasks
and grammars, which limits its scalability. Applying similar
ideas to real-world data remains an open issue.
Architectural designs such as the MAC model use stepwise attention
and memory operations controlled by separate units. These
models generalize well on visual tasks and produce
interpretable intermediate results, but their sequential nature
may reduce efficiency on long text inputs. Hybrid systems
such as NeSS combine neural networks with recursive
symbolic operations. They generalize on tasks like SCAN and
program induction but rely on curriculum learning and
predefined operations. Relation Networks introduce relational
modules that compare object pairs. They perform well on
relational tasks but do not address recursion or variable
binding.</p>
        <p>For LLMs, these insights imply that generalization and
rule application could benefit from discrete symbolic
modules. Such capabilities could emerge more reliably if LLMs
were trained under regimes that induce abstraction, for
example, using episodic meta-learning, architectural
modularity, or targeted pruning techniques. They also suggest that
interventions at the representation level, e.g., to isolate
functional subspaces or inject compositional operators, could
enhance interpretability and robustness, particularly for
multi-step reasoning tasks. These results challenge the assumption
that Transformers are inherently non-compositional and
offer mechanisms to steer their internal dynamics toward
more generalizable computation.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Contextual Inference and Embedding Techniques</title>
      <p>Understanding how neural systems represent context
requires models that combine representation learning with
memory and inference. This section reviews biologically
inspired approaches that address these challenges through
various embedding techniques, contextual inference
frameworks, and representational models that support abstraction
and relational reasoning. We begin with models that
generate efficient embeddings from sparse neural codes, and move
toward those that organize memory and support flexible
navigation in semantic and conceptual spaces.</p>
      <p>A biologically inspired model for word embeddings, based
on the architecture of the mushroom body of the olfactory
system of the fruit fly, is presented in [18]. This network uses
sparse, competitive dynamics to transform word-context
pairs into binary codes, with global inhibition based on a
k-winners-take-all (kWTA) mechanism. Learning is driven by
projecting word-context pairs into high-dimensional sparse
representations, where each word activates a random subset
of neurons within the context vector space. This
mechanism, modulated by global inhibition and word rarity,
enables the system to capture semantic similarity and
context-sensitive word senses. These sparse embeddings, i.e., the
FlyVec model, support tasks such as word-sense
disambiguation and document classification, with high computational
efficiency.</p>
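      <p>The kWTA step can be sketched as follows. The sketch assumes random projection weights in place of learned synapses and a tiny hypothetical vocabulary; it shows only how a sparse binary code is produced, not how the cited model is trained.</p>

```python
import random

def kwta_hash(x, weights, k):
    """Binary code with ones at the k units that receive the largest input."""
    activations = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    order = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    code = [0] * len(activations)
    for i in order[:k]:
        code[i] = 1   # winners stay active, all other units are inhibited
    return code

rng = random.Random(1)
vocab = ["bank", "river", "money", "water", "loan"]
n_units, k = 16, 3
# Random weights are an assumption standing in for learned synapses.
weights = [[rng.random() for _ in vocab] for _ in range(n_units)]
# Bag-of-words input: the target "bank" with context words "river" and "water".
x = [1.0 if w in ("bank", "river", "water") else 0.0 for w in vocab]
code = kwta_hash(x, weights, k)
print(code)
```

      <p>The same target word paired with a different context (for example "money" and "loan") would generally activate a different set of winners, which is how such codes can separate word senses.</p>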
      <p>Building on this foundation, some follow-up studies
refine the approach and extend its applicability. One of them
[19] introduces a continual learning rule that updates only
synapses from the top active units to the target class output,
leaving all others frozen. This sparsity fixes the number
of synapses updated per example and limits interference.
Another work [20] extends the model from words to full
sentences, i.e., the Comply method, by encoding word positions
as complex phases and learning a single complex-valued
parameter matrix. A kWTA step then produces compact,
interpretable binary sentence embeddings that preserve both
semantics and word order.</p>
      <p>While the fly-inspired models capture context through
local coding, the COIN framework [21] introduces an
approach grounded in Bayesian inference. It posits that the
brain maintains multiple latent context representations that
guide memory creation, expression, and updating. Contexts
are inferred probabilistically from sensory cues, feedback,
and time, rather than through direct observation. The model
accounts for classical conditioning, episodic recall, decision
making, and motor learning. It also distinguishes between
proper learning, which involves memory updates, and
apparent learning, which consists of adjustments to context
beliefs. The COIN model updates each memory in direct
proportion to the inferred probability of its context. This
approach ensures that learning reflects the level of uncertainty
about which context is active.</p>
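      <p>The probability-weighted update can be sketched in a heavily simplified form. The sketch assumes scalar memories, a Gaussian likelihood around each memory, a fixed prior over two contexts, and a fixed learning rate; these are illustrative assumptions and the function names are hypothetical, whereas the cited model infers contexts with a richer probabilistic machinery.</p>

```python
import math

def context_posterior(obs, memories, prior, sigma):
    """Bayes rule: P(context | obs) with a Gaussian likelihood around each memory."""
    likes = [math.exp(-0.5 * ((obs - m) / sigma) ** 2) for m in memories]
    joint = [lk * p for lk, p in zip(likes, prior)]
    z = sum(joint)
    return [j / z for j in joint]

def coin_like_update(obs, memories, prior, sigma=1.0, lr=0.5):
    """Move each memory toward the observation, scaled by its context probability."""
    post = context_posterior(obs, memories, prior, sigma)
    updated = [m + lr * p * (obs - m) for m, p in zip(memories, post)]
    return updated, post

# Two contexts with memories at 0.0 and 5.0; the observation fits the second.
memories, post = coin_like_update(4.5, [0.0, 5.0], [0.5, 0.5])
print(post)
print(memories)
```

      <p>In this example nearly all of the update goes to the memory whose context is judged probable, while the other memory is left almost untouched, which is the behavior that protects earlier knowledge from being overwritten.</p>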
      <p>Neuroscience research suggests that the hippocampus
supports context-dependent representation and reasoning
by organizing knowledge into structured internal spaces.
Mechanisms such as cognitive maps (representations of
relationships between states or concepts), successor
representations (encodings of expected future state occupancy), and
grid cells (neurons that exhibit periodic spatial firing
patterns and help to pinpoint the current location) offer models
for constructing embeddings and inference procedures.</p>
      <p>Although originally studied for spatial navigation, these
mechanisms also appear to support conceptual reasoning.
Empirical evidence that the brain reuses spatial navigation
mechanisms for abstract reasoning is provided in [22].
Human participants who had to learn a conceptual space
defined by visual features of bird images exhibited grid-like
activation patterns in the entorhinal cortex, analogous to
those observed during physical navigation. This supports
the idea that the brain encodes conceptual relationships
using spatial structure, and suggests a shared computational
substrate for spatial and non-spatial inference.</p>
      <p>A successor representation (SR) is introduced in [23],
where each state is encoded by the expected future
occupancy of other states under a policy. This predictive model
separates transition dynamics from goals and supports
efficient updates to value functions. While originally applied
to spatial navigation, SRs also provide a general framework
for organizing relational knowledge; this idea is applied to
semantic memory in [24]. A neural network encodes animal
species using handcrafted features and learns a cognitive
map based on expected feature-based similarity transitions.
Varying the SR discount factor produces coarse or fine
conceptual groupings, e.g., insects vs. mammals, and the model
interpolates between known animals to classify new or
incomplete inputs. This supports the use of predictive
relational maps for abstract conceptual knowledge in order to
generalize beyond training data.</p>
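      <p>For a fixed policy with transition matrix T and discount factor gamma, the SR is the discounted sum of powers of T, which equals the inverse of (I - gamma T). The sketch below computes it by fixed-point iteration on a hypothetical three-state ring; the function name and the example are assumptions for illustration.</p>

```python
def successor_representation(T, gamma, n_iter=200):
    """Iterate M = I + gamma * T @ M, which converges to (I - gamma*T)^-1."""
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(n_iter):
        M = [
            [identity[i][j] + gamma * sum(T[i][k] * M[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
    return M

# Deterministic ring: state 0 -> 1 -> 2 -> 0.
T = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
M = successor_representation(T, gamma=0.5)
for row in M:
    print([round(v, 4) for v in row])
```

      <p>Each row sums to 1 / (1 - gamma), and a larger discount factor spreads expected occupancy over more distant states, which corresponds to the coarser groupings mentioned above.</p>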
      <p>A model that allows standard 2D grid cells to encode
high-dimensional variables is proposed in [25]. The system
uses random linear projections to embed high-dimensional
inputs into periodic activity patterns across multiple grid
modules. This mixed modular code enables linear decoding
of positions in abstract vector spaces and supports multiple
variable types without modifying the network architecture.
It resolves the dimensionality bottleneck in grid codes by
preserving pairwise relations and scaling to arbitrarily many
dimensions.</p>
      <p>The authors of [26] suggest that both spatial and
conceptual representations arise in fact from a general clustering
mechanism, where grid-like patterns in navigation tasks
emerge from uniform sampling, while conceptual clusters
reflect semantic similarity.</p>
      <sec id="sec-5-1">
        <title>Discussion</title>
        <p>These papers highlight how context-sensitive
embeddings that support abstraction, generalization, and memory
in neural systems can emerge from predictive, probabilistic,
or geometric codes. An important idea is that the brain
infers latent contexts to determine when to store, retrieve,
or update information. This allows it to handle
discontinuities in experience without overwriting past knowledge, a
property essential for continual learning. SRs and grid-like
codes provide compact embeddings that encode relations
and support flexible planning and classification, even in
conceptual domains.</p>
        <p>These principles can also suggest some ways to extend
LLMs. A model that infers latent context could decide which
information to retrieve or update based on changes in input
or task. Spatial coding schemes, such as SRs or grid-based
encodings, could replace fixed position embeddings and
organize concepts based on relational patterns. This may help
models to recognize analogies or reuse knowledge across
tasks. Sparse and competitive embedding mechanisms could
increase memory efficiency and allow the representation of
multiple meanings of a word or concept, depending on
context. These models propose biologically grounded strategies
for encoding information in ways that support robust
reasoning and generalization for different tasks, challenges that
remain essential to scaling and extending LLM capabilities.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>5. Neuro-Symbolic and Hybrid Systems</title>
      <p>This section presents some developments that bridge neural
learning with symbolic or algorithmic methods. We begin
with models that couple neural controllers with
differentiable memory, then explore systems that integrate symbolic
reasoning more explicitly, and close the section with
architectures inspired by biological memory systems.</p>
      <p>The Differentiable Neural Computer (DNC) [27] offers an
influential example of neuro-symbolic integration.
Building on the earlier Neural Turing Machine (NTM) model
[28], DNC augments a recurrent neural network with a
differentiable external memory matrix. This setup allows the
model to perform operations that resemble reading and
writing variables in traditional computing. The DNC controller
learns to access memory through soft attention mechanisms
that include content-based retrieval, temporal linking
(storing the order of writes in a temporal link matrix), and
usage-based allocation (tracking memory usage to guide writes to
unused locations). These mechanisms give the model the
ability to construct and manipulate data structures such as
lists, trees, and graphs. As a result, the DNC can handle
tasks like pathfinding, graph inference, or array sorting.
Moreover, it can generalize to variable-length inputs and
can perform operations that resemble classical procedural
logic within a fully differentiable framework. Unlike classic
neural networks that entangle memory with computation,
the DNC separates memory and control, which enables
behavior that resembles algorithmic processing.</p>
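      <p>The content-based retrieval mentioned above can be illustrated with a minimal sketch. It is not the DNC implementation from [27]: temporal linking and usage-based allocation are left out, and the names <monospace>content_read</monospace> and <monospace>sharpness</monospace> are assumptions made for this example. Each memory row is weighted by a softmax over its cosine similarity to a key, and the read vector is the weighted sum of the rows.</p>
      <preformat>
```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def content_read(memory, key, sharpness):
    """Soft read: weight every memory row by its similarity to the key."""
    weights = softmax([sharpness * cosine(row, key) for row in memory])
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(width)]
    return weights, read

# Hypothetical 3-slot memory; the key is closest to the first row.
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
weights, read = content_read(memory, key=[0.9, 0.1, 0.0], sharpness=10.0)
```
</preformat>
      <p>Because every step is a smooth function of the key, gradients can flow from the read vector back to whatever produced the key, which is what lets a controller learn where to look.</p>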
      <p>While the DNC emphasizes a general-purpose external
memory, the Neuro-Symbolic Concept Learner [29]
introduces a modular design and symbolic reasoning into visual
question answering. The model decomposes the learning
process into three components: a visual perception
module that extracts object-level features, a semantic parser
that translates natural language questions into symbolic
programs, and a program executor that interprets these
programs on the scene. All modules are jointly trained from
image–question–answer tuples, and do not require
annotated object labels. Visual attributes are modeled as neural
operators that map object embeddings into interpretable
concept spaces (e.g., shape, color), while symbolic programs
capture compositional logic through executable sequences.
This architecture enables generalization to new object
configurations, new visual domains, and longer queries.</p>
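      <p>As a rough illustration of how a program executor can interpret a symbolic program on a scene, the sketch below runs a short sequence of operations over per-object soft masks. The scene format, the operation names (<monospace>filter</monospace>, <monospace>count</monospace>, <monospace>exist</monospace>), and the probabilities are assumptions for this example, not the representation used in [29]; in the actual model the concept probabilities would come from learned neural operators.</p>
      <preformat>
```python
def run_program(scene, program):
    """Execute a sequence of (op, *args) steps over per-object soft masks."""
    mask = [1.0] * len(scene)  # start by attending to every object
    for op, *args in program:
        if op == "filter":
            attribute, concept = args
            # Multiply in the probability that each object has the concept.
            mask = [m * obj[attribute].get(concept, 0.0)
                    for m, obj in zip(mask, scene)]
        elif op == "count":
            return round(sum(mask))
        elif op == "exist":
            return max(mask) > 0.5
        else:
            raise ValueError(f"unknown operation: {op}")
    return mask

# Hypothetical scene: each object carries concept probabilities per attribute.
scene = [
    {"color": {"red": 0.9, "blue": 0.1}, "shape": {"cube": 0.8, "sphere": 0.2}},
    {"color": {"red": 0.2, "blue": 0.8}, "shape": {"cube": 0.9, "sphere": 0.1}},
    {"color": {"red": 0.95, "blue": 0.05}, "shape": {"cube": 0.1, "sphere": 0.9}},
]

# "How many red cubes are there?"
program = [("filter", "color", "red"), ("filter", "shape", "cube"), ("count",)]
answer = run_program(scene, program)
```
</preformat>
      <p>Longer questions only add steps to the program, which gives an intuition for why such a design can extend to longer queries.</p>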
      <p>The previous models embed symbolic control directly
within the architecture. The next approach focuses on
teaching neural networks to imitate the stepwise behavior of
classical algorithms. In the Neural Algorithmic Reasoning
framework [30], neural networks are trained to emulate
traditional algorithms, such as Dijkstra’s shortest path and
value iteration. The learning process proceeds in stages. A
neural processor first learns algorithmic steps by training
on low-dimensional abstract inputs. Then, encoder and
decoder networks transform real-world inputs into and out of
the latent space of the processor. This separation allows the
model to preserve algorithmic invariants while adapting to
noisy, high-dimensional data. It addresses the algorithmic
bottleneck, i.e., the problem of compressing complex
real-world inputs into low-dimensional representations required
by traditional algorithms. The approach proves especially
powerful in reinforcement learning tasks, where latent planning
through algorithmic modules improves performance in
complex or partially observed environments.</p>
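      <p>To make the notion of learning algorithmic steps more concrete, the sketch below runs value iteration on a tiny deterministic problem and records every intermediate value vector. Such step-by-step trajectories are the kind of signal a neural processor could be trained to imitate. The three-state chain, the function names, and the reward layout are assumptions for illustration only, and no neural network is trained here.</p>
      <preformat>
```python
def bellman_backup(values, next_state, reward, gamma):
    """One synchronous value-iteration step for a deterministic MDP."""
    return [
        max(reward[s][a] + gamma * values[next_state[s][a]]
            for a in range(len(next_state[s])))
        for s in range(len(values))
    ]

def run_value_iteration(next_state, reward, gamma, steps):
    """Return every intermediate value vector, not only the final one."""
    values = [0.0] * len(next_state)
    trajectory = [values]
    for _ in range(steps):
        values = bellman_backup(values, next_state, reward, gamma)
        trajectory.append(values)
    return trajectory

# Hypothetical chain 0 -> 1 -> 2; action 0 stays in place, action 1 moves right.
# Only the move from state 1 to state 2 is rewarded.
next_state = [[0, 1], [1, 2], [2, 2]]
reward = [[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
trajectory = run_value_iteration(next_state, reward, gamma=0.9, steps=3)
```
</preformat>
      <p>Each consecutive pair in <monospace>trajectory</monospace> is one input-output example of a single algorithmic step on abstract, low-dimensional inputs.</p>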
      <p>The Hint-ReLIC method [31] improves the generalization
of neural algorithmic models by incorporating causal
regularization. The authors observe that many different inputs can
lead to identical intermediate computations in algorithms.
Based on this, they propose a self-supervised contrastive
learning objective that encourages graph neural networks
to produce similar internal representations for such inputs.
Using a causal graph to formalize this invariance, the method
generates augmented examples that preserve execution
trajectories, and enforces stepwise consistency across different
variants. This improves out-of-distribution performance on
algorithmic reasoning benchmarks such as CLRS-30 [32],
particularly for sorting and graph tasks.</p>
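      <p>The idea of pulling together representations of inputs that share the same intermediate computation can be sketched with a generic InfoNCE-style contrastive loss. This is a common contrastive objective used here as a stand-in, not the exact Hint-ReLIC objective from [31]; the function name, the temperature value, and the toy vectors are assumptions.</p>
      <preformat>
```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Contrastive loss: low when the anchor is closer to the positive
    than to any of the negatives."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_norm - logits[0]

# Toy hidden states: the anchor and the positive stand for two different inputs
# that lead to the same next algorithmic step.
anchor = [1.0, 0.0]
aligned = info_nce(anchor, positive=[1.0, 0.0],
                   negatives=[[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce(anchor, positive=[0.0, 1.0],
                      negatives=[[1.0, 0.0], [-1.0, 0.0]])
```
</preformat>
      <p>Minimizing such a loss encourages the encoder to ignore input variations that do not change the next step, which is the invariance the paragraph above describes.</p>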
      <p>The Shared Dual Memory Transformer (SDMTR) [33]
modifies the standard Transformer architecture by
replacing self-attention with a memory-based system inspired by
the brain. The model introduces two shared memory
components: a workspace that acts like working memory, and
a long-term memory (LTM) that stores useful information
across layers. At each layer, input tokens (which act as
independent modules or units) compete to write to the
workspace using sparse attention. Only the most relevant
tokens are allowed to update the workspace. The workspace
is then broadcast back to all tokens to guide further
processing. Important workspace content is stored in LTM using
outer product attention, which allows the model to build
high-capacity memory representations. During inference,
the model retrieves relevant information from LTM to guide
token updates. This design supports iterative reasoning
and helps the model to generalize better on relational tasks.
SDMTR is reported to outperform standard Transformers
on benchmarks such as bAbI [34], Sort-of-CLEVR [35], and
the “triangle detection” visual reasoning task [36].</p>
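      <p>The write competition and broadcast described above can be sketched as follows. This is a simplified illustration with a single workspace slot, dot-product relevance, and a fixed gate, all of which are assumptions; it omits the long-term memory and does not reproduce the architecture in [33].</p>
      <preformat>
```python
import math

def topk_write(tokens, slot, k):
    """Let only the k most relevant tokens update a single workspace slot."""
    scores = [sum(t * s for t, s in zip(tok, slot)) for tok in tokens]
    winners = sorted(range(len(tokens)),
                     key=lambda i: scores[i], reverse=True)[:k]
    # Softmax restricted to the winners; all other tokens get zero weight.
    peak = max(scores[i] for i in winners)
    exps = {i: math.exp(scores[i] - peak) for i in winners}
    total = sum(exps.values())
    new_slot = [sum(exps[i] / total * tokens[i][j] for i in winners)
                for j in range(len(slot))]
    return winners, new_slot

def broadcast(tokens, slot, gate=0.5):
    """Send the workspace content back to every token (gated residual add)."""
    return [[t + gate * s for t, s in zip(tok, slot)] for tok in tokens]

# Hypothetical 2-dimensional tokens and an initial workspace slot.
tokens = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [-1.0, 0.0]]
slot = [1.0, 0.0]
winners, new_slot = topk_write(tokens, slot, k=2)
updated = broadcast(tokens, new_slot)
```
</preformat>
      <p>Restricting the softmax to the winners is what makes the write sparse: tokens outside the top k cannot influence the workspace at that layer, but they still receive its content through the broadcast.</p>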
      <sec id="sec-7-1">
        <title>Discussion</title>
        <p>Neuro-symbolic and hybrid models approach reasoning
with different strengths and limitations. The Differentiable
Neural Computer augments a learned controller with an
external memory that supports content-based lookup,
temporal linking, and dynamic allocation. It enables scalable
memory use and variable-like access, and performs well
on synthetic question-answering and graph tasks.
However, training is computationally expensive, attention cost
increases with memory size, and stability with very large
memories remains unresolved. Hint-ReLIC improves
neural algorithmic reasoning by aligning hidden states across
identical intermediate steps in an algorithm. It enforces
invariance when different inputs imply the same next step,
which boosts out-of-distribution accuracy on CLRS
benchmarks. However, the approach depends on access to
algorithm trajectories or manually generated hints, which
may limit its applicability to real-world data. The
Neuro-Symbolic Concept Learner combines object detection,
parsing, and a symbolic program executor, and achieves high
accuracy on CLEVR using natural supervision. Yet, it
relies on curriculum learning and clean object masks, which
makes transfer to cluttered or ambiguous scenes an open
problem. The Shared Dual-Memory Transformer introduces
a competition-based workspace and a long-term memory. It
reports good results on reasoning benchmarks and enables
memory visualization. Still, it can have high computation
costs and lacks evaluation on long language tasks or noisy
inputs.</p>
        <p>For Transformer-based LLMs, these ideas provide a
blueprint for enhancing compositional generalization,
especially in domains that require logic, recursion, or algorithmic
manipulation. Incorporating symbolic intermediates, such
as learned execution traces or memory read-write patterns,
can make reasoning processes more interpretable and
modular. Hybrid systems also offer a way to disentangle content
(e.g., object or entity representations) from operations (e.g.,
comparison or traversal), which reduces the burden on
attention mechanisms to simulate algorithmic flow. These
models suggest that carefully constrained
hybridization, and not just greater scale, may be necessary, or at least
helpful, to achieve robust reasoning in LLMs.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>6. Neuroscience-Inspired Architectures</title>
      <p>This section presents some models that use inspiration from
the hippocampus, prefrontal cortex, and broader cortical
dynamics to build architectures capable of generalization,
composition, and memory manipulation that can support
forms of reasoning relevant to AI.</p>
      <p>The Tolman-Eichenbaum Machine (TEM) [37] provides a
unified model for how the hippocampus supports both
spatial navigation and relational reasoning. It represents tasks
as graph-based transitions and explains how the
hippocampus links relational codes (encoded by grid-like
representations) with sensory content using fast Hebbian learning.
This separation of relational patterns and sensory detail
enables the model to generalize across tasks, transfer
knowledge to new environments, and produce consistent
remapping patterns. TEM accounts for a wide range of neural
responses observed in neuroscience experiments.</p>
      <p>Building on this foundation, the TEM-t model [38]
reframes the components of TEM using a Transformer
architecture. The authors introduce a modified Transformer
that uses recurrent positional encodings and causal
attention. They show that it can reproduce spatial cell types and
memory behaviors similar to the hippocampus. A formal
equivalence is proven between Transformer attention and
Hebbian memory retrieval mechanisms, which suggests that
brain-like architectures and modern Transformer models
share underlying computational principles. This
formulation supports the idea that Transformer-based systems could
benefit from models of hippocampal function and offers a
framework for integrating relational memory with language
and abstract reasoning.</p>
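      <p>The intuition behind the link between attention and Hebbian memory can be checked numerically in a simplified setting. The sketch below stores key-value pairs as a sum of outer products (a Hebbian update) and shows that reading this memory with a query gives the same result as unnormalized dot-product attention over the stored pairs. The softmax is omitted, so this is only the linear case and not the formal result from [38]; all names and values are assumptions.</p>
      <preformat>
```python
def outer_add(memory, value, key):
    """Hebbian update: add the outer product value * key^T to the memory."""
    for r, v in enumerate(value):
        for c, k in enumerate(key):
            memory[r][c] += v * k

def hebbian_read(memory, query):
    """Retrieve by multiplying the memory matrix with the query."""
    return [sum(m * q for m, q in zip(row, query)) for row in memory]

def linear_attention(keys, values, query):
    """Unnormalized attention: weight each value by key-query similarity."""
    weights = [sum(k * q for k, q in zip(key, query)) for key in keys]
    dim = len(values[0])
    return [sum(w * val[j] for w, val in zip(weights, values))
            for j in range(dim)]

# Hypothetical stored pairs (3-dimensional keys, 2-dimensional values).
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
values = [[2.0, 0.0], [0.0, 3.0]]

memory = [[0.0] * 3 for _ in range(2)]
for key, value in zip(keys, values):
    outer_add(memory, value, key)

query = [0.5, 1.0, 0.0]
from_memory = hebbian_read(memory, query)
from_attention = linear_attention(keys, values, query)
```
</preformat>
      <p>Both routes compute the sum of the stored values weighted by key-query inner products; the memory matrix simply precomputes that sum.</p>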
      <p>While TEM is designed to support generalization through
relational mapping across spatial and task-based contexts,
the model in [39] proposes that hippocampal replay
supports compositional generation. Replay refers to the brain’s
reactivation of neural sequences during rest or planning,
often at accelerated timescales. The paper argues that these
sequences do not merely reflect past experiences but can
combine entities and roles, such as “verb” or “start point”, to
build new representations. The authors suggest that replay
combines role-bound elements into new configurations, and
allows the system to infer facts that were never explicitly
learned. They present experimental evidence that replay
can generate unexperienced sequences, support abstract
reasoning, and construct compound knowledge. These
insights recast replay not only as a memory mechanism but
as a computational resource for relational and symbolic
inference.</p>
      <p>Working memory frameworks are extended in [40] by
introducing adaptive chunking as a learned strategy for
managing the tradeoff between memory precision and
capacity. The model uses reinforcement signals to decide when to store
raw inputs and when to merge similar items into shared
representations, depending on task demands. This chunking
mechanism allows the network to store more information
with fewer resources while accepting a controlled loss in
precision. The model accounts for recency bias, i.e., the
improved recall of recent items, as a consequence of
selectively updating or replacing earlier representations. It
also explains differential chunking, where the likelihood of
merging items increases with their similarity and decreases
with the number of items.</p>
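      <p>The precision-capacity tradeoff behind chunking can be shown with a small sketch. Unlike the model in [40], which learns its chunking policy from reinforcement signals, this example uses a fixed rule: an item is merged into its nearest stored slot when it is similar enough or when no free slot remains. The scalar items, the threshold, and the function name are assumptions.</p>
      <preformat>
```python
def store_with_chunking(items, capacity, merge_distance):
    """Store scalar items in at most `capacity` slots, merging similar ones.

    Each slot is [mean_value, count]; merging replaces the stored value with
    the running mean, which saves a slot at the cost of precision.
    """
    slots = []
    for x in items:
        nearest = min(slots, key=lambda s: abs(s[0] - x), default=None)
        full = len(slots) >= capacity
        close = nearest is not None and merge_distance >= abs(nearest[0] - x)
        if nearest is not None and (full or close):
            nearest[0] = (nearest[0] * nearest[1] + x) / (nearest[1] + 1)
            nearest[1] += 1
        else:
            slots.append([x, 1])
    return slots

# Five hypothetical feature values (e.g., orientations) and three slots.
slots = store_with_chunking([10.0, 12.0, 50.0, 90.0, 52.0],
                            capacity=3, merge_distance=5.0)
```
</preformat>
      <p>Here five items fit into three slots, but the first two items are both recalled as their mean, which is the controlled loss of precision mentioned above.</p>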
      <p>The model in [41] simulates how semantic knowledge
can emerge from word learning grounded in sensory and
motor experience. It uses a spiking neural network with
biologically plausible connectivity and Hebbian learning to
associate spoken words with perceptual and action-based
features. These associations produce distributed neural cell
assemblies that integrate phonological, visual, and motor
information without requiring labeled data or external
supervision. Words for objects activate vision-related circuits,
while action words activate motor-related circuits. These
category-specific activations converge in multimodal
regions that function as semantic hubs. The model shows how
semantic representations can self-organize from repeated
exposure to co-occurring patterns of sound and
sensorimotor input, and it explains both the emergence of
modality-specific word meanings and shared multimodal
representations based on the architecture of the network and learning
dynamics.</p>
      <p>The Semantic Pointer Architecture (SPA) [42] is a
cognitive architecture that aims to explain how high-level
behavior can arise from low-level neural interactions. The
modeling starts with neurons with tuning curves, where
each neuron responds more or less strongly depending on
the input values, and uses these responses to encode
vectors (similar to the activations of neural populations or cell
assemblies) and apply transformations [43]. SPA uses these
neural building blocks to create high-level representations
called semantic pointers, i.e., fixed-size vectors that can
combine concepts using circular convolution, and later recover
their parts. This allows a neural system to represent rules,
memories, and sequences in a way that supports reasoning
and control. The architecture was used to create a
computational model with millions of spiking neurons capable
of visual processing and action planning, e.g., recognizing
digits and remembering lists, counting, answering
questions, drawing, and solving psychological tasks like Raven’s
matrices [44].</p>
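      <p>Binding with circular convolution, and the later recovery of a part, can be sketched at the vector level. The example below works on plain random vectors and does not model spiking neurons or any part of the SPA implementation; the dimensionality, the seed, and the function names are assumptions. Unbinding uses the standard approximate inverse (an index reversal), so the recovered vector is a noisy version of the original filler.</p>
      <preformat>
```python
import math
import random

def circular_convolution(a, b):
    """Bind two vectors of equal length n into a single vector of length n."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def approximate_inverse(a):
    """Involution: keep element 0 and reverse the rest."""
    return [a[0]] + a[:0:-1]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / math.sqrt(sum(x * x for x in u) * sum(y * y for y in v))

def random_vector(n, rng):
    """Random vector with expected unit length."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

rng = random.Random(0)
n = 256
role = random_vector(n, rng)
filler = random_vector(n, rng)
distractor = random_vector(n, rng)

bound = circular_convolution(role, filler)  # same size as its parts
recovered = circular_convolution(bound, approximate_inverse(role))

match = cosine(recovered, filler)        # noisy copy of the filler
mismatch = cosine(recovered, distractor)  # unrelated vector
```
</preformat>
      <p>The bound vector has the same length as its parts, which is why such representations can be nested without growing in size; the price is that recovery is approximate and typically needs a clean-up step against known vectors.</p>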
      <list list-type="bullet">
        <title>Ideas for neural reasoning systems</title>
        <list-item><p>A dedicated role subspace in token embeddings could help track the roles of variables and preserve their identity across long contexts</p></list-item>
        <list-item><p>Address-based indexing could allow attention to target specific memory locations and reuse intermediate results, like pointers</p></list-item>
        <list-item><p>Sparse attention could help retrieve information by similarity with fewer activated elements, and more efficiency and interpretability</p></list-item>
        <list-item><p>Dividing the residual stream into multiple task-specific branches that activate based on context could help hold several role-value pairs simultaneously without interference</p></list-item>
        <list-item><p>Fast Hebbian plasticity could form temporary role-value bindings that dissolve after a reasoning step</p></list-item>
        <list-item><p>Phase-based modulation could help maintain multiple bindings in parallel, separated by timing – or simulated using, e.g., learned sinusoidal gates</p></list-item>
        <list-item><p>Sparsity or communication constraints could promote functional modules that remain interpretable and easy to update</p></list-item>
        <list-item><p>Episodic meta-learning with few-shot tasks could support rule discovery and reuse in different domains</p></list-item>
        <list-item><p>A (possibly latent) neural stack could support step-by-step composition, recursion, and hierarchical input processing</p></list-item>
        <list-item><p>Relational attention with pairwise comparisons or edge labels could represent token relationships directly and improve judgments of equality, order, and grouping</p></list-item>
        <list-item><p>Sparse binary embeddings could reduce memory use while keeping concept meanings distinct</p></list-item>
        <list-item><p>Embeddings inspired by successor representation could improve next token prediction by modeling likely future states</p></list-item>
        <list-item><p>Complex-valued position encodings with phase information could unify order and meaning in a single operation</p></list-item>
        <list-item><p>Position encodings inspired by grid cells could capture conceptual distance more effectively</p></list-item>
        <list-item><p>Direction vectors in a learned relational map could support movement between related concepts</p></list-item>
        <list-item><p>Freezing inactive parameters during continual training could preserve existing knowledge while integrating new information</p></list-item>
        <list-item><p>A trainable controller with differentiable external memory could manage storage and retrieval of intermediate results for long reasoning tasks</p></list-item>
        <list-item><p>A shared workspace memory could allow tokens to compete for relevance and promote the most important ones to longer-term memory</p></list-item>
        <list-item><p>Generating soft programs or reasoning graphs could break complex queries into clear, step-by-step operations</p></list-item>
        <list-item><p>Consistency constraints during training could help produce similar internal states for logically equivalent inputs</p></list-item>
        <list-item><p>Training using contrastive examples with irrelevant input variations could help identify the relevant information that drives the next reasoning step</p></list-item>
        <list-item><p>A recurrent mechanism that tracks positional change over time could replace the static sinusoidal positional encoding used in standard Transformers and provide a more dynamic, context-aware sense of sequence</p></list-item>
        <list-item><p>Letting tokens compete for a small number of write locations at each layer could form stable representations of important ideas</p></list-item>
        <list-item><p>A chunking mechanism could merge memory items based on similarity and usage to improve memory efficiency</p></list-item>
        <list-item><p>A replay mechanism could retrieve earlier sequences and replace entities in them to support counterfactual and imaginative reasoning</p></list-item>
        <list-item><p>Multimodal training data could ground language in perception and action, which may improve transfer learning</p></list-item>
      </list>
      <sec id="sec-9-1">
        <title>Discussion</title>
        <p>These models highlight how biologically inspired
mechanisms, such as information replay, chunking, or multimodal
grounding, can inform the design of neural architectures
with enhanced reasoning capabilities. A recurring theme is
the separation of abstract relational patterns from episodic
content, implemented through distinct coding strategies
or dynamic binding operations. This stands in contrast to
current LLMs, which merge function and content within
a single representation space. Mechanisms like fast
Hebbian learning and adaptive chunking could offer efficient
means to encode, update, and reuse relational information
without retraining, a feature that could enable more
sample-efficient reasoning in LLMs. Replay-based models show
that sequence generation need not rely on sampling from
static representations. Instead, reasoning can emerge from
compositional recombination of role-bound elements, a
process more similar to planning than retrieval. Integrating
this with LLMs could improve zero-shot generalization and
support multi-step inference through internal simulation
rather than token prediction alone.</p>
        <p>The use of competitive memory systems and
task-sensitive gating illustrates how biological networks manage
precision-capacity tradeoffs. These mechanisms could
inspire selective memory routing in LLMs, which would
enable models to preserve important relational patterns while
compressing redundant inputs. Likewise, the emergence
of cell assemblies in semantic grounding models shows
how distributed representations can self-organize to reflect
shared abstractions across different modalities. This points to
opportunities for LLMs to learn grounded semantic
representations through unsupervised multimodal training.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>7. Conclusions</title>
      <p>This survey has examined a range of brain-inspired
mechanisms that can offer insights into how symbolic-like
reasoning can be implemented within neural systems. While
recent advances in LLMs have brought multi-step reasoning
into the mainstream, they have come at significant
computational cost and with limited interpretability. In contrast,
biological systems perform compositional, context-sensitive
reasoning with remarkable efficiency and generalization
capabilities, which may motivate the continued exploration of
neural principles that could inform artificial architectures.</p>
      <p>To summarize the main insights of the survey, Table 1
presents several biologically inspired ideas drawn from the
five categories discussed above. For each category, it
highlights specific mechanisms that could improve reasoning in
neural architectures such as LLMs. These ideas aim to
support capabilities such as variable tracking, memory control,
compositional processing, and relational representations.</p>
      <p>Looking forward, such strategies could offer promising
paths for advancing neuro-symbolic reasoning. Sparse
attention mechanisms, guided by embedding similarity, could
help retrieve relevant information while limiting
interference from irrelevant content. Using separate residual
channels for different roles could allow multiple role-value pairs
to coexist without conflict in a single forward pass. Fast
Hebbian updates could form transient links that support
step-by-step inference without altering long-term weights. Methods
inspired by time-based neural oscillations could assign
distinct phases to different bindings, reducing crosstalk during
multi-variable reasoning. Training regimes that promote
modularity, such as sparsity and pruning, could yield more
stable, interpretable components. Episodic meta-learning
could foster the discovery of reusable rules, while relational
attention could improve performance on tasks involving
equality, order, and grouping. Positional encodings inspired
by cognitive maps, like successor representations or
grid-like embeddings, could give the model a sense of spatial
or sequential distance. Replay and chunking mechanisms
could enhance memory efficiency and support the
refinement of longer reasoning sequences.</p>
      <p>Among these, we consider three ideas to be especially
important. First, inducing explicit role representations,
possibly directly in the embedding space, could let the model
assign a stable placeholder to an entity in a reasoning task.
The model could then replace that placeholder with different
values as needed, just as variable substitution works in logic,
perhaps by means of fast plasticity. Second, indirection via
pointer-like mechanisms could keep roles and values
separate, which would allow for flexible recombination and
improved generalization. It could even enable creative reuse
of familiar elements in new ways. Third, a neural controller
and an external symbolic component, such as a memory or a
stack, could enable algorithmic, recursive, and hierarchical
operations without entangling function with content. This
separation could support better interpretability, more
reliable handling of long or nested problems, and stable editing
of stored information.</p>
      <p>As research progresses, incorporating such design
principles into neural reasoning systems may lead to architectures
that generalize better, reason more transparently, and
operate with greater efficiency, moving closer to the adaptability
and robustness of human cognition.</p>
    </sec>
    <sec id="sec-11">
      <title>Acknowledgments</title>
      <p>This research was supported by the project “Romanian Hub
for Artificial Intelligence - HRIA”, Smart Growth,
Digitization and Financial Instruments Program, 2021-2027,
MySMIS no. 351416.</p>
    </sec>
    <sec id="sec-12">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author used
ChatGPT 4o for grammar and spelling check. After using this
tool, the author reviewed and edited the content as needed
and takes full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>A survey on neural-symbolic learning systems</article-title>
          ,
          <source>Neural Networks</source>
          <volume>166</volume>
          (
          <year>2023</year>
          )
          <fpage>105</fpage>
          -
          <lpage>126</lpage>
          . doi:10.1016/j.neunet.2023.06.028.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Bhuyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramdane-Cherif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tomar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Neuro-symbolic artificial intelligence: A survey</article-title>
          ,
          <source>Neural Computing and Applications</source>
          <volume>36</volume>
          (
          <year>2024</year>
          )
          <fpage>12809</fpage>
          -
          <lpage>12844</lpage>
          . doi:10.1007/s00521-024-09960-z.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Afridi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ruchkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Surveying neuro-symbolic approaches for reliable artificial intelligence of things</article-title>
          ,
          <source>Journal of Reliable Intelligent Environments</source>
          <volume>10</volume>
          (
          <year>2024</year>
          )
          <fpage>257</fpage>
          -
          <lpage>279</lpage>
          . doi:10.1007/s40860-024-00231-1.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Leon</surname>
          </string-name>
          ,
          <article-title>A review of findings from neuroscience and cognitive psychology as possible inspiration for the path to artificial general intelligence</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2401.10904. arXiv:2401.10904.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Steinhardt</surname>
          </string-name>
          ,
          <article-title>How do language models bind entities in context?</article-title>
          ,
          <source>in: Proceedings of the International Conference on Learning Representations (ICLR)</source>
          ,
          <year>2024</year>
          . URL: https://openreview.net/forum?id=zb3b6oKO77.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kriete</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Noelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>O'Reilly</surname>
          </string-name>
          ,
          <article-title>Indirection and symbol-like processing in the prefrontal cortex and basal ganglia</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>110</volume>
          (
          <year>2013</year>
          )
          <fpage>16390</fpage>
          -
          <lpage>16395</lpage>
          . doi:10.1073/pnas.1303547110.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Legenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Papadimitriou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vempala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Maass</surname>
          </string-name>
          ,
          <article-title>Assembly pointers for variable binding in networks of spiking neurons</article-title>
          ,
          <year>2016</year>
          . URL: https://arxiv.org/abs/1611.03698. arXiv:1611.03698.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Hayworth</surname>
          </string-name>
          ,
          <article-title>Dynamically partitionable autoassociative networks as a solution to the neural binding problem</article-title>
          ,
          <source>Frontiers in Computational Neuroscience</source>
          <volume>6</volume>
          (
          <year>2012</year>
          )
          <elocation-id>73</elocation-id>
          . doi:10.3389/fncom.2012.00073.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lebiere</surname>
          </string-name>
          ,
          <source>The Atomic Components of Thought</source>
          ,
          <publisher-name>Lawrence Erlbaum Associates</publisher-name>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Senoussi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Verbeke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Verguts</surname>
          </string-name>
          ,
          <article-title>Time-based binding as a solution to and a limitation for flexible cognition</article-title>
          ,
          <source>Frontiers in Psychology</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <elocation-id>798061</elocation-id>
          . doi:10.3389/fpsyg.2021.798061.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Lepori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Serre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pavlick</surname>
          </string-name>
          ,
          <article-title>Break it down: Evidence for structural compositionality in neural networks</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>36</volume>
          ,
          <year>2023</year>
          . URL: https://openreview.net/forum?id=rwbzMiuFQl.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Lake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Baroni</surname>
          </string-name>
          ,
          <article-title>Human-like systematic generalization through a meta-learning neural net-</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>