=Paper=
{{Paper
|id=Vol-1419/paper0092
|storemode=property
|title=A Vector Representation of Fluid Construction Grammar Using Holographic Reduced Representations
|pdfUrl=https://ceur-ws.org/Vol-1419/paper0092.pdf
|volume=Vol-1419
|dblpUrl=https://dblp.org/rec/conf/eapcogsci/KnightSS15
}}
==A Vector Representation of Fluid Construction Grammar Using Holographic Reduced Representations==
A vector representation of Fluid Construction Grammar using Holographic Reduced Representations
Yana Knight¹ and Michael Spranger² and Luc Steels¹
¹ Artificial Intelligence Laboratory, Free University of Brussels (VUB), Pleinlaan 2, 1050 Brussels, Belgium, and Dr. Aiguader 88, Barcelona 08003, Spain
² Sony Computer Science Laboratories, 3-14-13 Higashigotanda, 141-0022 Tokyo, Japan
Abstract

The question of how symbol systems can be instantiated in neural network-like computation is still open. Many technical challenges remain and most proposals do not scale up to realistic examples of symbol processing, for example, language understanding or language production. Here we use a top-down approach. We start from Fluid Construction Grammar, a well worked-out framework for language processing that is compatible with recent insights into Construction Grammar, and investigate how we could build a neural compiler that automatically translates grammatical constructions and grammatical processing into neural computations. We proceed in two steps. FCG is translated from symbolic processing to numeric processing using a vector symbolic architecture, and this numeric processing is then translated into neural network computation. Our experiments are still in an early stage but already show promise.

Keywords: Vector Symbolic Architectures; Fluid Construction Grammar; Connectionist Symbol Processing

Introduction

Since the early days of cognitive science in the late nineteen-fifties, there has been a struggle to reconcile two approaches to modeling intelligence and cognition: a symbolic and a numeric one. The symbolic approach postulates an abstract layer with symbols, symbolic structures, and operations over these symbolic structures, so that it is straightforward to implement the kind of analysis that logicians, linguists, and psychologists tend to make. AI researchers have built remarkable technology to support such implementations based on high-level ‘symbolic’ languages like LISP.

The numeric approach looks at cognitive processing in terms of numeric operations. It is motivated by the fact that biological neuronal networks are dynamical systems and that numeric processing can model self-organizing processes. So the numeric approach tries to get intelligent behavior without needing to postulate symbolic structures and operations explicitly. There have been several waves exploiting this numeric approach under the heading of neural networks and, most recently, deep learning.

The symbolic approach has proven its worth in modeling very large scale language systems, search engines, problem solvers, models of expert knowledge, ontological and episodic memory, etc., but most of these applications rely heavily on a human analyst who identifies the relevant symbols and symbol processing operations. It is usually claimed that the symbolic approach is unable to deal with learning and grounding, but this criticism often ignores work within the large field of (symbolic) machine learning and work on grounding symbolic representations in perception and action by physical robots. While the numeric approach has proven its worth in the domains of pattern recognition, which includes feature extraction, category formation, and pattern detection, it has not been equally successful in the implementation of ‘true’ physical symbol systems (Newell & Simon, 1976). More specifically, it turns out to be non-trivial to represent a group of properties of an object (a feature structure), to compare feature structures to each other, and to handle variable binding and feature structure merging, all operations which many researchers have argued to be necessary for intelligence. We believe the symbolic and the numeric approach can only be reconciled when they are viewed as two levels of description of the same system, whereby the former describes and models natural objects at a higher level than the latter. Each level has its own abstractions at which regularities are revealed and each its own laws of operation. It is necessary and highly interesting to find out how the different levels map to each other. This paper sets some small steps in this direction. We do not go immediately from the symbol level to the numeric level but rather use a two-step process: mapping the symbolic level to a symbolic vector layer, as suggested by several researchers (Hinton, 1990; Neumann, 2001; Plate, 1994; Gayler, Levy, & Bod, 2010), and then mapping this layer to a possible neural implementation level in terms of populations of neurons, which has also been explored already in (Eliasmith, 2013).

This paper focuses only on the first step. Experiments have also been done for the second step using the Nengo framework (Eliasmith, 2013) but are not reported here. The paper begins by introducing Fluid Construction Grammar as a challenging test case for studying how to map symbolic processing to numeric processing. It then proceeds to describe a potential approach for the translation of FCG to vector form, namely Holographic Reduced Representations (HRR). Finally, it presents the results of experiments using HRR to produce a vector representation of FCG feature structures and core operators.
FCG and its key operations

Fluid Construction Grammar is a computational platform for implementing construction grammars (Steels, 2011). It is a typical example of a complex symbol system addressing a core competence of the human brain, namely the representation and processing (comprehension, production, learning) of language. FCG was originally designed for modeling language learning and language change (Steels, 2012), and language-based robot interaction (Steels & Hild, 2012). More recently research has focused on challenging problems in linguistics and broader-coverage grammars. The components of FCG are symbols, feature structures, transient structures and constructions.

Symbols are the elementary units of information. They stand in for syntactic categories (like ‘noun’ or ‘plural’), semantic categories (like ‘animate’ or ‘future’), unit names (e.g. ‘noun-phrase-17’), grammatical functions (like ‘subject’ or ‘head’), ordering relations of words and phrases (e.g. ‘meets’ or ‘precedes’), meaning predicates, etc. A basic grammar of a human language like English would certainly feature thousands of such symbols, and the set of meaning predicates is basically open-ended. Symbols can be bound to variables, which are written as names with a question mark in front, as in ?unit, ?gender, ?subject, etc. Symbol names are chosen to make sense for us, but of course the FCG interpreter has no clue what they mean. The meaning of a symbol only comes from its functions in the rest of the system.

Feature structures are a way to group information about a particular linguistic unit, for example a word or a phrase. A feature structure has a name to index it (which is again a symbol, possibly a variable) and a set of features and values. Construction grammars group all features of a unit together, whatever the level. So a feature structure has phonetic and phonological features, morphological information, syntactic and semantic categories, pragmatic information, as well as structural information about the many possible relations between units (constituent structure, functional structure, argument structure, information structure, temporal structure, etc.). All of these are represented explicitly using features and values. The values of a feature can be elementary symbols, sets of symbols (e.g. the constituents of a phrase form a set), sequences or feature structures, thus allowing a hierarchically structured feature structure.

Feature structures are used to represent transient structures. These are the structures built up during comprehension and production. The features are grouped into a semantic pole, which contains the more semantically oriented features, including pragmatics and semantic categorisations, and a syntactic pole, which contains the form-oriented features. For comprehension, the initial transient structure contains all the information that could be gleaned from the form of the utterance by perceptual processes, and this transient structure is then progressively expanded until it contains enough information to interpret the utterance. For production, the initial transient structure contains the meaning to be expressed, and this structure is then transformed until enough information is present to render a concrete utterance. There are often multiple ways to expand a transient structure, so a search space is unavoidable.

Constructions are also represented as feature structures, but they are more abstract than transient structures. They typically contain variables that can be bound to the elements of transient structures and they contain less information about some of the units. Constructions have a conditional part, which has to match with the transient structure they try to expand, and a contributing part, which they add to the transient structure if the conditional part matches. The conditional part is decomposed into a production lock, which constrains the activation of a construction in production, and a comprehension lock, which constrains the construction in comprehension. When the lock fits with the transient structure, all information from the construction which is not there yet is merged into the transient structure. So match and merge are the most fundamental operations of the grammar.

Here is a simplified example of the double object construction (Goldberg, 1995) handling phrases like “she gave him a book”. It has a unit for the clause as a whole (?ditransitive-clause) and for the different constituents (?NP-1, ?verb, ?NP-2 and ?NP-3). The conditional part is on the right-hand side of the arrow and the contributing part on the left-hand side. Units in the conditional part have a comprehension lock (on top) and a production lock (below). The ≤ sign between units means ‘immediately precedes’.

[Figure: the double object construction. The contributing part (left of the arrow) adds constituents and valence information to ?ditransitive-clause; the conditional part (right of the arrow) shows the units ?ditransitive-clause, ?NP-1, ?verb, ?NP-2 and ?NP-3, each with its comprehension lock on top (features such as predicates, referent, sem-function, sem-cat and sem-valence) and its production lock below (features such as phrasal-cat, case and syn-valence), with ≤ marking immediate precedence between constituents.]
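To fix intuitions, here is a purely schematic Python sketch of what a single unit of a transient structure bundles together. This is our own illustration, not actual FCG notation: the feature names are taken from the example above, while the constituent names are hypothetical.

```python
# Schematic illustration only; real FCG units carry much richer information.
unit = {
    "name": "noun-phrase-17",                     # unit name (a symbol)
    "sem-pole": {                                 # semantically oriented features
        "sem-function": "referring",
        "sem-cat": {"animate"},
    },
    "syn-pole": {                                 # form-oriented features
        "phrasal-cat": "NP",
        "case": "nominative",
        "constituents": {"article-12", "noun-7"}, # hypothetical constituent units
    },
}
```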
A regular speaker of a language probably knows something on the order of half a million constructions. So it is not possible to simply throw them all in one bag and try constructions randomly. FCG therefore features various mechanisms to fine-tune which construction should be selected and, if more than one construction matches, which one should be pursued further. They include priming networks, organisation of constructions into sets, partial orderings, a scoring mechanism, footprints preventing some constructions from becoming active, etc.

Obviously Fluid Construction Grammar is a sophisticated computational formalism, but all the mechanisms it proposes are absolutely necessary to achieve accurate (as opposed to approximate) comprehension and correct production of utterances given a particular meaning. Due to space limitations, the reader is referred to the rapidly growing literature on FCG for more details (see also emergent-languages.org).

Holographic Reduced Representations

The kind of structures used in FCG can be represented using the AI techniques provided by symbolic programming languages such as LISP. It is very non-trivial to implement FCG, but it is doable and adequate implementations exist. We now come to the key question: can similar mechanisms also be implemented using a numeric approach? This means that the basic elements of FCG are encoded in a numeric format and the basic FCG operations are translated into numeric operations over them. Various efforts to implement symbolic systems in neural terms have already been undertaken (Shastri & Ajjanagadde, 1993). The key problem however is scaling: the number of neurons required to represent the millions of symbols in human-scale grammars becomes biologically totally unrealistic (Eliasmith, 2013).

Vector-based approaches known as Vector Symbolic Architectures (VSA) have demonstrated promising results for representing and manipulating symbolic structures using distributed representations. Smolensky’s tensor product is one of the simplest variants of VSA (Smolensky, 1990). However, the main problem with this approach is that a tensor binding results in an n²-dimensional vector and, in the case of recursive representations, does not scale well (Eliasmith, 2013). Alternative approaches have been suggested, such as Binary Spatter Codes (Kanerva, 1997) and Holographic Reduced Representations (HRR) (Plate, 1994), where binding is done with circular convolution, which results in an n-dimensional vector, so that the number of dimensions does not increase. Hinton explored distributed representations based on the idea of reduced representations (Hinton, 1990) and later (Neumann, 2001) demonstrated that connectionist representational schemes based on the concept of reduced representation and on the functional composition of hierarchical structures can support structure-sensitive processes which show a degree of systematicity. VSAs provide a means for representing structured knowledge using distributed vector representations and as such provide a way to translate symbolic to vector representations (Eliasmith, 2013). Since vectors can be used by many machine learning methods (for example, neural networks, support-vector machines, etc.), once a symbol system has been translated to a vector space architecture, a subsequent implementation of such a system in numeric terms should give us access to the machine learning methods associated with distributed representations.

Given the claims made by these various authors, we decided to explore VSA, more specifically Holographic Reduced Representations (HRR), for implementing Fluid Construction Grammar, and then further translate this representation using existing neural mappings (Eliasmith, 2013). The remainder of this section reflects on what is required for the mapping from FCG to VSA.

Representing FCG entities

Symbols. A symbol in FCG can be mapped to a randomly generated n-dimensional vector. All the elements of the vectors are drawn from a normal distribution N(0, 1/n) following (Plate, 1994). The symbol and its symbol vector are stored in an error-correction memory as explained later.

Feature-value pairs. A feature-value pair (the primary component of a feature structure) can be mapped to the circular convolution of two vectors, the feature vector and the value vector. Following (Plate, 1994), we define convolution as follows:

Z = X ⊗ Y   (1)

z_j = Σ_{k=0}^{n−1} x_k y_{j−k}   for j = 0, ..., n − 1 (indices modulo n)   (2)

Once we have a combined feature-value vector, we can, given the feature, extract the value using circular correlation (Plate, 1994; Neumann, 2001), which convolves the pair with the approximate inverse of the feature vector (Plate, 1994):

X = Z ⊕ Y   (3)

x_j = Σ_{k=0}^{n−1} z_k y_{j+k}   for j = 0, ..., n − 1 (indices modulo n)   (4)

Feature-set. A feature set consists of a set of feature-value pairs. This can be mapped to HRR using vector addition (Plate, 1994):

Z = X ⊗ Y + T ⊗ S   (5)

Feature structures. Feature structures in FCG consist of feature sets combined into units. Each unit has a unique name which is stored as a symbol in the symbol memory. A feature structure is constructed in the same way as a feature set, i.e. by convolution and addition, except that, to also include units, we now convolve unit-feature-value triples rather than feature-value pairs. The feature structure is the addition of all triples:

Z = U ⊗ X ⊗ Y + U ⊗ T ⊗ S   (6)
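As an illustration of Equations 1-4, the following sketch (our own, not the authors' code; it assumes only numpy, which the experiments below also use) builds random symbol vectors, binds a hypothetical feature to a value with circular convolution, and recovers a noisy copy of the value with circular correlation. The FFT route is used because circular convolution is a pointwise product in the frequency domain; a direct double loop over Equation 2 would give the same result.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024                                   # HRR vector dimensionality

def random_symbol():
    """A symbol vector with elements drawn from N(0, 1/n) (Plate, 1994)."""
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def bind(x, y):
    """Circular convolution z_j = sum_k x_k y_(j-k) (Equations 1-2), via FFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))

def approx_inverse(y):
    """Approximate inverse y* with y*_j = y_(-j mod n) (Plate, 1994)."""
    return np.roll(y[::-1], 1)

def unbind(z, y):
    """Circular correlation (Equations 3-4): convolve the trace with the
    approximate inverse of y; the result is a noisy copy of the bound value."""
    return bind(z, approx_inverse(y))

# Hypothetical feature and value symbols
lex_cat, noun = random_symbol(), random_symbol()
pair = bind(lex_cat, noun)                 # the feature-value pair lex-cat: noun
noisy_noun = unbind(pair, lex_cat)         # given the feature, recover the value
```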
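Continuing the same sketch, a feature set (Equation 5) is the superposition of its bound pairs, and a feature structure (Equation 6) superposes unit ⊗ feature ⊗ value triples; circular convolution is associative and commutative, so each triple can be built with two successive binds. All names below are hypothetical.

```python
def feature_set(pairs):
    """Equation 5: add up the circular convolutions of all feature-value pairs."""
    return sum(bind(f, v) for f, v in pairs)

def feature_structure(triples):
    """Equation 6: add up unit ⊗ feature ⊗ value bindings for every triple."""
    return sum(bind(u, bind(f, v)) for u, f, v in triples)

# A toy transient structure with two units and a handful of features
unit1, unit2 = random_symbol(), random_symbol()
number, plural, referent, obj1 = (random_symbol() for _ in range(4))
ts = feature_structure([
    (unit1, lex_cat, noun),       # unit1 has lex-cat: noun
    (unit1, number, plural),      # unit1 has number: plural
    (unit2, referent, obj1),      # unit2 has referent: obj-1
])
# Retrieving a value: unbind with unit ⊗ feature, then clean up the noisy
# result against the error-correction memory described in the next section.
noisy_value = unbind(ts, bind(unit1, number))
```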
Since we can represent feature structures, we can also represent transient structures as well as constructions of arbitrary length.

Parts of the structure (also called a trace) can be retrieved using the correlation operation (Equation 3). For example, given U ⊗ X and the whole structure, we can obtain Y. However, correlation on traces is noisy. A trace preserves just enough information to recognise the result but not to reconstruct it. Therefore, we need an error-correction memory that stores vectors for possible units, features and values. The memory is used to compare the noisy output of the correlation operation with all vectors known to the system. Various comparison measures can be used; the most standard one is the dot product, which for two normalized vectors is equal to the cosine of their angle (Neumann, 2001). We define the following similarity for two vectors:

sim(X, Y) = (X · Y) / (‖X‖ ‖Y‖)   (7)

This similarity is used to retrieve the vector stored in the error-correction memory with the highest similarity to the output of correlation. That vector represents the most plausible value of the feature in a particular trace.

Matching and Merging

The promise of distributed representations is that they can do very fast operations over complete feature structures (such as comparing them) without traversing the components as would be done in a symbolic implementation. Let us see how far we might be able to get without decomposition. FCG basically needs (i) the ability to copy a feature structure, (ii) to compare two feature structures (matching) and (iii) to merge a feature structure with another one.

Copying. It is not difficult to copy feature structures because it simply means copying the corresponding vectors. However, we often need to replace all variables in a feature structure either by new variables (e.g. when copying a construction before applying it) or by their bindings (e.g. when creating the expanded transient structure after matching). It has been suggested that copy-with-variation can be done by convolving the current structure A with a transformation vector T (Plate, 1994):

A ⊗ T = B   (8)

The transformation vector is constructed by convolving the new values with the inverses of the current values and then adding up the pairs by vector addition. For example, in order to set the value of the lex-cat feature from the current value ?x to the new value which is the binding of ?x, e.g. noun, the inverse of ?x should be convolved with noun. The full transformation vector is

x′ ⊗ y + z′ ⊗ w   (9)

Such vectors can be hand-constructed (Plate, 1994), which is not desirable, or learnt from examples as shown in (Neumann, 2002).

Matching. In general, matching two feature structures can be done by the same principle that is used in the error-correction memory, i.e. similarity (Equation 7). Since we use the dot product as our similarity measure, we have a computationally fast operation which is well understood mathematically. Using the dot product provides us with a way to compare a feature structure to every structure in a pool of structures and to find the structure with the highest similarity as the closest match. However, this ignores two problems we have not tackled yet: (i) match in FCG is an includes rather than a similarity operation: if the source structure is a subset of the target structure, they should still match, even if the similarity between the two structures is low. In fact, this is very common because the lock of a construction is always a subset of the transient structure; and (ii) this does not yet take variable bindings into account.

Merging. Merging two feature structures is straightforward because their respective vector representations can simply be added. It is possible to deal with variable bindings by first transforming both feature structures, replacing the variables by their bindings as discussed earlier. However, there are also some tricky issues to be resolved (e.g. variables may be bound to other variables, forming a variable chain, and then one of them has to be substituted for all the others).
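The error-correction memory and the similarity of Equation 7 can be sketched as follows; this continues the illustrative code above and is not the implementation used for the experiments reported below.

```python
def sim(x, y):
    """Equation 7: cosine similarity between two vectors."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

class CleanupMemory:
    """Error-correction memory: stores known symbol vectors and returns the
    name of the stored vector most similar to a noisy correlation result."""
    def __init__(self):
        self.items = {}                    # symbol name -> symbol vector

    def add(self, name, vector):
        self.items[name] = vector

    def cleanup(self, noisy):
        return max(self.items, key=lambda name: sim(noisy, self.items[name]))

memory = CleanupMemory()
for name, vector in [("noun", noun), ("plural", plural), ("obj-1", obj1)]:
    memory.add(name, vector)

# The noisy trace retrieved earlier for (unit1, number) should clean up to "plural".
print(memory.cleanup(noisy_value))
```

In the same spirit, matching a construction lock against a pool of transient structures would compute sim between the lock vector and every candidate and keep the best-scoring one, and merging reduces to adding the two structure vectors, exactly as described above.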
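Copy-with-variation (Equations 8-9) can be sketched in the same style: the transformation vector is the sum of approximate-inverse(old value) ⊗ new value bindings, and convolving a structure with it substitutes the values. This is our own hedged reading of Plate's proposal, reusing the helpers above, not the authors' implementation.

```python
def transformation_vector(substitutions):
    """Equation 9: sum of approx_inverse(old_value) ⊗ new_value bindings."""
    return sum(bind(approx_inverse(old), new) for old, new in substitutions)

# Replace the variable ?x, currently the value of lex-cat, by its binding noun.
var_x = random_symbol()                       # vector standing in for ?x
a = bind(lex_cat, var_x)                      # structure A with lex-cat: ?x
t = transformation_vector([(var_x, noun)])    # T built from the binding ?x -> noun
b = bind(a, t)                                # Equation 8: A ⊗ T = B
print(sim(unbind(b, lex_cat), noun))          # the value is now a noisy copy of noun
```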
Preliminary implementation experiments

We now report on first steps in implementing the FCG→VSA mapping described above. Experiments were carried out in Python using the mathematical extension numpy.

Feature encoding and retrieval. First, we tested the precision of value retrieval from a feature set and a feature structure. We were particularly interested in the relationship between HRR vector dimensionality, the length of the FCG feature structure, and retrieval accuracy. We therefore tested different lengths of FCG feature sets/structures (5, 50, 500) against different dimensionalities of the HRR vectors (10, 100, 1,000, etc.). We did 100 runs for each combination (results averaged). Each time, HRR vectors for individual features were randomly initialized and combined into a feature-structure HRR vector using convolution and addition. Then we attempted to retrieve all feature values and measured the number of correct retrievals divided by the original FCG feature set length. Figure 1 (top) illustrates how the precision score increases with vectors of higher dimensionality, consistent with previous experiments with HRR (Neumann, 2001). To encode FCG feature sets with an average length of about 50-100 features, we required around 3,000-dimensional HRR vectors. This figure also illustrates how differences in HRR vector dimensionality relate to the cardinality of the feature set. For example, in order to represent and successfully retrieve all values from a 5-pair set, around 300 dimensions appear to be sufficient, while a 500-pair feature set requires just over 30,000. Our feature-value pairs behave in accordance with (Plate, 1994), which can be described as follows:

n = 3.16 (k − 0.25) ln(m / q³)   (10)

where n is a lower bound on the dimensionality of the vectors in order for retrieval to have a probability of error q; k is the number of pairs in a trace and m is the number of vectors in the error-correction memory. For example, to have a q of 10⁻¹ in a 5-pair trace with 1,500 items in the error-correction memory, n should be approximately 213. For a smaller q of 10⁻², around 300 dimensions are required. This roughly follows the n and q observed and illustrated in Figure 1 (top).

Feature structures (triples) behave similarly to pairs, although the dimensions required to encode triples increase. Figure 1 (bottom) illustrates how both feature sets and feature structures scale for various structure sizes (5; 10; 50; 100; 500; 1,000 pairs/triples). These results can be directly translated to FCG. A toy grammar starts at 1-5 units per construction with 1-10 feature-value pairs in each. A more complex grammar can have around 10 units with approximately the same number of pairs in each unit. Represented as triples, such structures can be encoded in vectors of around 6,000 dimensions. Really large grammars of 30 units and 30 feature-value pairs in each unit require roughly 100,000 dimensions.

Figure 1: Top: The effects of dimensionality on precision scores in feature sets and structures of various lengths. Bottom: Scaling of sets and structures (vector dimensionality at which precision scores become 1.0).

Matching. We numerically investigated whether HRR representations can be used to implement the FCG match operation in two phases. We investigated changes in sim(X, Y) under various conditions, working with feature sets (rather than feature structures) for simplicity.

First, we investigated how the similarity (Equation 7) between two HRR vectors responds to structural changes versus changes in underlying feature values. Figure 2 (top, bottom) shows that changes to feature values result in a greater decrease in similarity (reaching 0.0 after 10² changes for a 100-pair structure) than structural changes, i.e. adding new pairs, which led to a more gradual similarity degradation (reaching 0.0 after 10⁵ additions for the same structure size). The difference between these two types of changes is important in FCG, where structures can be structurally different and still match, while structures that, for example, differ in feature values should not. This finding is also in line with previous experiments comparing HRR structures using the dot product (Plate, 1994), where similarity was more sensitive to the content of feature-value pairs than to the quantity of pairs.

Figure 2: Top: Comparison of similarity values for changes in structure vs changes in bindings for a structure of 10,000 pairs. Bottom: Comparison of similarity as structures of different original length are extended.
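As a small reconstruction of the comparison just described (our own sketch reusing the helpers above, not the authors' experiment code), one can contrast replacing the values of existing pairs with adding the same number of new pairs and observe that, as reported above, the former lowers sim(X, Y) more than the latter.

```python
def random_pairs(count):
    return [(random_symbol(), random_symbol()) for _ in range(count)]

base_pairs = random_pairs(100)
base = feature_set(base_pairs)

# (a) change the values of the first 10 pairs
changed = feature_set([(f, random_symbol()) for f, _ in base_pairs[:10]]
                      + base_pairs[10:])

# (b) keep all 100 pairs and add 10 new ones instead
extended = feature_set(base_pairs + random_pairs(10))

print("after value changes:     ", sim(base, changed))
print("after structural changes:", sim(base, extended))
```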
When adding or removing pairs, similarity (Equation 7) is affected as illustrated in Figure 3. Initially, both structures contained 1,000 pairs; the second structure was then changed by an order of magnitude at a time, so that the first structure gradually became a subset of the second. The graph illustrates that as the structure is extended with new pairs, the similarity of the two structures begins to drop, despite the fact that the structures share the initial 1,000 pairs. However, this degradation is very gradual, and similarities reach 0.0 only after 10⁵ new pairs have been added. Furthermore, it can be seen that for larger structures such degradation is more gradual than for smaller ones (see Figure 2 bottom). For example, for 10-pair structures similarity is almost 0.0 after 10³ new pairs have been added. This is still, however, fairly gradual, considering that a 10-pair structure had 1,000 pairs added to it before becoming dissimilar to the original. The drop in sim(X, Y) appears to be asymmetrical: removing pairs gives lower similarity than adding the same number of pairs. This is expected, since removing pairs results in less shared information between the structures than adding pairs.

Figure 3: Gradual changes to a feature set of 1,000 feature-value pairs and their effects on similarity values.

These findings are good and bad news at the same time. On the one hand, it is good that feature value changes have a more drastic effect on sim(X, Y) than structure changes. On the other hand, the system will have to be able to autonomously find out whether two HRR vectors represent feature sets which differ structurally or in terms of feature values. Possibly this distinction can be learnt, but finding a solution that is invariant to HRR vector dimensionality and feature set cardinality is likely not an easy task. Another problem is the commutative nature of sim(X, Y), which essentially does not allow us to determine which is the more inclusive feature set.

Conclusions

This paper has speculated how a linguistically and computationally adequate formalism for language, namely Fluid Construction Grammar, could be represented in a Vector Symbolic Architecture, more specifically Holographic Reduced Representations, as a step towards a neural implementation. We proposed a number of steps and reported some preliminary implementation experiments with promising results. The main conclusion so far is that a number of fundamental issues remain to be solved to make the FCG→VSA mapping fully operational, particularly the issue of implementing a matching operation that uses a binding list and possibly extends it while matching takes place.

Acknowledgments

Research reported in this paper was funded by the Marie Curie ESSENCE ITN and carried out at the AI lab, Vrije Universiteit Brussel and the Institut de Biologia Evolutiva (UPF-CSIC), Barcelona, financed by the FET OPEN Insight Project and the Marie Curie Integration Grant EVOLAN. We are indebted to comments from Johan Loeckx and Emilia Garcia Casademont.

References

Eliasmith, C. (2013). How to build a brain: A neural architecture for biological cognition. New York, NY: Oxford University Press.
Gayler, R. W., Levy, S. D., & Bod, R. (2010). Explanatory aspirations and the scandal of cognitive neuroscience. In Proceedings of the First Annual Meeting of the BICA Society (pp. 42–51). Amsterdam: IOS Press.
Goldberg, A. (1995). Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press.
Hinton, G. (1990). Mapping part-whole hierarchies into connectionist networks. Artificial Intelligence, 46, 47–75.
Kanerva, P. (1997). Fully distributed representation. In Proc. Real World Computing Symposium (pp. 358–365). Tsukuba-city, Japan: Real World Computing Partnership.
Neumann, J. (2001). Holistic processing of hierarchical structures in connectionist networks. Doctoral dissertation, School of Informatics, The University of Edinburgh, UK.
Neumann, J. (2002). Learning the systematic transformation of holographic reduced representations. Cognitive Systems Research, 3, 227–235.
Newell, A., & Simon, H. A. (1976). Computer science as empirical inquiry: Symbols and search. Communications of the ACM, 19, 113–126.
Ohlsson, S., & Langley, P. (1985). Identifying solution paths in cognitive diagnosis (Tech. Rep. No. CMU-RI-TR-85-2). Pittsburgh, PA: Carnegie Mellon University, The Robotics Institute.
Plate, T. A. (1994). Distributed representations and nested compositional structure. Doctoral dissertation, Graduate Department of Computer Science, University of Toronto.
Shastri, L., & Ajjanagadde, V. (1993). From simple associations to systematic reasoning: A connectionist representation of rules, variables, and dynamic bindings. Behavioral and Brain Sciences, 16, 417–494.
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46, 159–216.
Steels, L. (Ed.). (2011). Design patterns in Fluid Construction Grammar. Amsterdam: John Benjamins.
Steels, L. (Ed.). (2012). Experiments in cultural language evolution. Amsterdam: John Benjamins.
Steels, L., & Hild, M. (Eds.). (2012). Language grounding in robots. New York: Springer-Verlag.