Usage-based Grammar Learning as Insight Problem Solving

Emilia Garcia-Casademont (2) and Luc Steels (1,2)
(1) ICREA, (2) Institut de Biologia Evolutiva (UPF-CSIC)
Dr. Aiguader 88, Barcelona 08003, Spain

Abstract

We report on computational experiments in which a learning agent incrementally acquires grammar from a tutoring agent through situated embodied interactions. The learner is able to detect impasses in routine language processing, such as missing a grammatical construction to integrate a word into the rest of the sentence structure, to move to a meta-level to repair these impasses, primarily based on semantics, and to then expand or restructure his grammar using the insights gained from repairs. The paper proposes a cognitive architecture able to support this kind of insight learning and tests it on a grammar learning task.

Keywords: Usage-based language learning; insight problem solving; Fluid Construction Grammar.

What is usage-based language learning?

Many researchers have argued for a usage-based approach (Langacker, 2000) to language learning based on empirical data from child language acquisition (Tomasello, 1992). This approach emphasizes that learning is essentially data-driven and gradual (Bybee, 2006). It takes place in embodied situated interactions in which not only the utterance and its syntactic properties (e.g. the ordering of the words) but also the underlying meanings and communicative context are available to the learner.

The usage-based approach goes hand in hand with a constructional perspective on grammar (Goldberg, 2006), which emphasizes that language structure is motivated by usage, instead of by innate, arbitrary, formal principles of universal grammar. Grammar learners have to discover what role grammatical markers and syntactic patterns play in expressing meaning and achieving communicative functions, and how syntax helps to dampen combinatorial explosions and avoid ambiguity. Construction grammar therefore argues that the fundamental unit of grammar is the construction, a sign that relates meaning or function with form through the intermediary of syntactic and semantic categorizations. A construction is very rich, packaging constraints at different layers of language (from phonetics and phonology to morphology, syntax, semantics and pragmatics) and from many different perspectives (phrase structure, functional structure, argument structure, temporal structure, information structure, etc.). Constructions are acquired in a gradual, data-driven way, with learners progressively creating a huge inventory of ever more sophisticated constructions linked in networks (Tomasello, 1992).

Computational simulations of usage-based constructional learning are rare, despite widespread support for a usage-based approach in cognitive science circles and the growing success of construction grammar in empirical linguistics, studies of child language, historical linguistics and other areas of language studies, even though there is some exceptional work, such as that by Nancy Chang (Chang, 2008). This computational gap is undoubtedly due to the fact that the field of computational construction grammar is still in its infancy and not enough computational research has been done so far on possible learning strategies.

Nevertheless, computer simulations are a valuable complement to the empirical work on language learning. They would help us a great deal to get a much clearer idea of what constructions look like from a computational point of view and how they are acquired. Computer simulations allow us to test the role of different cognitive mechanisms for grammar learning by performing experiments with and without these mechanisms and by varying the learning challenges and the nature and amount of the input the learner gets.

Insight Problem Solving

Today, most machine learning of language uses a form of Bayesian unsupervised grammar learning operating over large amounts of data (Bod & Scha, 2003). Because these models are data-driven, they subscribe to one of the key tenets of a usage-based approach to grammar learning, and they have therefore already been used for simulating child grammar acquisition (Bannard, Lieven, & Tomasello, 2009). But although this approach is obviously relevant, we explore here a complementary learning method which views grammar learning as a form of insight problem solving.

Insight problem solving is familiar from the psychological literature on problem solving (Ohlsson, 1984). It makes a distinction between two modes: routine and meta-level problem solving. For routine problem solving, the problem solver uses a set of micro-operations that either provide a direct solution or are a step towards a solution. The main challenge for the problem solver is to find a path between the initial problem state and the goal in a search space of possible hypotheses.

Meta-level problem solving is necessary when the problem solver reaches an impasse: the given statement of the problem (as understood by the problem solver), the representations being used, and the known operators do not allow a straightforward solution. Meta-level operators may then unblock the impasse. For example, they may reinterpret the problem statement by relaxing some of its constraints (as needed for solving the 9-dot problem), they may change internal representations inferred from the problem (as in the Anthony and Cleopatra problem (Patrick & Ahmed, 2014)), or they may possibly introduce new operators.

One very common meta-operator, studied in particular by Koehler in his classic insight-learning experiments with chimpanzees, is to coerce objects to have functions that they normally do not have (Koehler, 1956), for example, to view a shoe as a hammer so that it can play the role of the instrument in a hitting action.

In the case of language, the micro-operations for routine problem solving are the applications of constructions that expand a transient structure (which captures a particular hypothesis for comprehending or producing an utterance) into a more extended transient structure. For example, a construction might combine an article and a noun into a noun phrase. Although some exploration and backtracking may be necessary to find out which construction needs to be applied, the problem solver is in principle able to solve the problem, i.e. to reconstruct the meaning of an utterance as listener or to produce an utterance expressing the target meaning as speaker.

An impasse here means, for example, that the hearer encounters an unknown word, a particular word does not fit within the syntactic pattern implied by its immediate context, an ordering constraint is violated, no structure can be found that integrates all words in the utterance, or there is syntactic and semantic ambiguity implying that some grammatical signals preventing ambiguity have not been picked up properly. These impasses are frequent when learning a new language. The speaker may also reach an impasse because, for example, he may lack a word to express a particular meaning, a word he would like to use does not fit with a construction already chosen, the available constructions may leave significant ambiguity, etc.

Meta-level problem solving for language involves a set of meta-operators and a cognitive architecture as in Figure 1. For example, the listener could try to handle a sentence which he cannot parse by ignoring possible errors in it (e.g. a wrong word order). Or he may handle the presence of an unknown word by inferring, from the syntactic context, the situation model and the ontology, what a possible meaning might be. The speaker could coerce a word to have an additional lexical category so that it fits, as in "He googled him", where the noun "google" has been coerced into a verb.

Figure 1: Cognitive architecture supporting insight language learning. There is a layer of routine problem solving, diagnostics continuously monitoring activity at this layer in order to discover an impasse, and meta-level operators trying to repair the impasse, for example by coercing a word to fit with an available grammatical construction.

But that is not the end of the story. After solving a problem using meta-operators, a learner should then try to expand his available repertoire of routine operators (i.e. constructions) to capture the insight gained from dealing with the impasse, so that in the future the impasse does not occur anymore. When that is the case, we talk about insight learning (Koehler, 1956). It is based on a set of learning-operators that make small changes to the grammar, for example adding a new lexical category to the lexical construction of a word, or making a construction more general by relaxing some constraints on one of its slots.
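
To make this architecture concrete, here is a minimal, hypothetical sketch (in Python) of its control flow. All names (apply_constructions, diagnose, repair, consolidate, etc.) are placeholders standing in for the corresponding components; this is an illustration under those assumptions, not the actual implementation.

  # Hedged sketch of the control flow of Figure 1. Routine processing applies
  # constructions; diagnostics detect impasses; meta-operators repair them;
  # learning-operators then consolidate the insight into the grammar.

  def comprehend(utterance, grammar, meta_operators, learning_operators,
                 initial_structure, apply_constructions, diagnose, max_repairs=10):
      transient = initial_structure(utterance)
      repairs_used = []
      for _ in range(max_repairs):
          transient = apply_constructions(transient, grammar)   # routine layer
          problems = diagnose(transient)                        # diagnostics
          if not problems:
              break                                             # no impasse left
          repaired = False
          for op in meta_operators:                             # meta-level repairs
              new_transient = op.repair(problems[0], transient)
              if new_transient is not None:
                  transient = new_transient
                  repairs_used.append((op, problems[0]))
                  repaired = True
                  break
          if not repaired:
              return None                                       # unresolved impasse
      for op, problem in repairs_used:                          # insight learning
          for learning_op in learning_operators:
              learning_op.consolidate(op, problem, grammar)
      return transient
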
Figure 2: Example of an interaction between two agents, one acting as tutor and the other as learner. A human experimenter creates scenes about which the agents can converse. The robots are equipped with sophisticated components for perception, action, world modeling, and script following in order to play a language game. Grammar learning and tutoring take place within the context of such games.

The rest of the paper reports on experiments attempting to computationally implement insight language learning. Our goal is not to handle the full complexity of human language, which would not be possible at this point and would make the results very hard to analyse, but rather to create a minimal model in order to test the feasibility of the approach and to examine first concrete examples of meta-operators and learning-operators. We therefore use a minimal representation of meaning and focus on a minimal form of syntactic structure, namely phrase structure.

The experiments are intended for the MYON robotic platform shown in Figure 2 (Steels & Hild, 2012). There are two humanoid robots, one acting as tutor and the other one as learner. The situation is manipulated by an experimenter who puts objects on the table (e.g. a small piece of paper, a wooden block, etc.) and performs actions with them. The robots are able to perceive the situation before them, segment the objects and detect relations, such as spatial relations or movements. They can also point to objects and gesture success or failure in the language game. The relations are compositional (as in "a black round wooden block") with unlimited recursion (as in "a small paper which is on a small paper which is on the table"). The situation model can also include mental states (e.g. "Jorge wants the wooden block (to be) on the table").


The next section first describes routine language processing. Then we turn to meta-level problem solving, discussing syntactic and semantic expansions, and to the learning operators. The paper concludes with some experimental results. There is an on-line web demo and additional material available at www.biologiaevolutiva.org/lsteels/demos/EAP-garcia-steels-2015/

'Routine' Language Processing

Meaning representation

The complex visual processing and conceptualization processes required for these experiments have been described at length in (Spranger, Loetzsch, & Steels, 2012). The situation models of speaker and hearer are not identical because their perception is different, but in general they are sufficiently shared that communication is possible; otherwise the language game fails. For the purposes of the experiments reported here, we have defined a 'situation model generator' that generates possible situation models in the same format as obtained by visual processing.

For the representation of the situation model we use well-known standard techniques from (typed second order) logic. The situation model is internally represented in terms of n-ary predicates, which have objects perceived in the world as arguments. The objects are labeled obj-1, obj-2, etc. The objects can be bound to variables, which are written with a question mark in front, as in ?obj-1, ?obj-2, etc. Each predicate consists of an attribute, which also acts as a type, and a value.

Unary predicates are written as:

  (attribute value object-to-which-predicate-applies)

as in

  (color red obj-1) or (material plastic ?obj-6)

In the first example, the color red is predicated about obj-1, i.e. obj-1 is red in the current situation model. In the second example, the material property plastic is predicated about a variable object ?obj-6. It will be true if there is an object in the world which has the material property plastic.

N-ary predicates are decomposed into a number of separate predicates: one for the predicate itself and then one for each of its arguments. For example, suppose there is a predicate of type moving (which covers all types of movement) with a possible value away; then there are two predicates for its arguments, as illustrated in the following example:

  (moving away ?r-2) ; the main predicate
  (away-arg-1 ?r-2 ?o-3) ; the object that is moving
  (away-arg-2 ?r-2 ?o-1) ; the object being moved away from

?r-2 gets bound to the event of moving, ?o-3 to the object that is moving and ?o-1 to the object that ?o-3 moves away from.

Different predications can be combined into conjunctions, and they are linked together by reusing the same variable names or object names. For example, the utterance "a small paper moves away from a wooden table" would be represented as:

  (moving away ?r-2) ; the main predicate
  (away-arg-1 ?r-2 ?o-3) ; the object that is moving
  (away-arg-2 ?r-2 ?o-1) ; away from
  (size small ?o-3) ; the moving object is small
  (material paper ?o-3) ; and its material is paper
  (physobj table ?o-1) ; the movement is away from a table
  (material wood ?o-1) ; which is made of wood

The different linkages between the predications through their argument equalities may be represented graphically as a semantic network (see Figure 3). One of the objects in this network is the topic of the utterance, for example the event itself (i.e. ?r-2), or the object which is moving (i.e. ?o-3), as in the utterance "the small paper moving-away from the wooden table".

Figure 3: Semantic network representing the meaning of an utterance. Each node is a predication and the links represent co-referential relations between the arguments.
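
To make this representation concrete, here is a minimal sketch in Python (not the notation or code of the actual system): predications are held as tuples, variables are strings starting with "?", and interpretation against a situation model is a naive backtracking search for consistent bindings.

  # Hedged sketch: predications as tuples, variables as strings starting with "?".
  meaning = [
      ("moving", "away", "?r-2"),        # the main predicate
      ("away-arg-1", "?r-2", "?o-3"),    # the object that is moving
      ("away-arg-2", "?r-2", "?o-1"),    # the object moved away from
      ("size", "small", "?o-3"),
      ("material", "paper", "?o-3"),
      ("physobj", "table", "?o-1"),
      ("material", "wood", "?o-1"),
  ]

  # A situation model uses the same format, but with concrete objects and events.
  situation = [
      ("moving", "away", "ev-1"),
      ("away-arg-1", "ev-1", "obj-3"),
      ("away-arg-2", "ev-1", "obj-1"),
      ("size", "small", "obj-3"),
      ("material", "paper", "obj-3"),
      ("physobj", "table", "obj-1"),
      ("material", "wood", "obj-1"),
  ]

  def is_variable(x):
      return isinstance(x, str) and x.startswith("?")

  def interpret(meaning, situation, bindings=None):
      """Find bindings of the variables so that every predication holds
      in the situation model (a naive backtracking matcher)."""
      bindings = dict(bindings or {})
      if not meaning:
          return bindings
      first, rest = meaning[0], meaning[1:]
      for fact in situation:
          trial = dict(bindings)
          ok = len(first) == len(fact)
          for m, f in zip(first, fact):
              if not ok:
                  break
              if is_variable(m):
                  ok = trial.setdefault(m, f) == f
              else:
                  ok = m == f
          if ok:
              result = interpret(rest, situation, trial)
              if result is not None:
                  return result
      return None

  print(interpret(meaning, situation))
  # e.g. {'?r-2': 'ev-1', '?o-3': 'obj-3', '?o-1': 'obj-1'}
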
To simplify the experiments, we use utterances with only content-carrying lexical items, and only word order and phrase structure as the means for expressing syntactic structure, ignoring other syntactic devices like morphology, grammatical function words, agreement, etc. Thus, the utterance "a small paper moves away from a wooden table" would be rendered as "small paper moves-away-from wooden table". There is of course no necessity to use English-like words; that is only done to make the utterances understandable for the reader, and there is no reason why the grammar will be English-like, except that English also makes heavy use of phrase structure.

Grammar representation

We use Fluid Construction Grammar (FCG) for the representation of the grammar (Steels, 2011). FCG views language processing in terms of operations over transient structures. A transient structure captures all that is known about a particular utterance being parsed or produced. In routine language processing, transient structures are expanded by the application of constructions in a process of matching (to see whether the construction can apply) and merging (to add information from the construction to the transient structure).


FCG represents transient structures in terms of feature structures, similar to many other computational formalisms in use today, such as HPSG. A feature structure consists of a set of units, which correspond to words and phrases, and features associated with these units. The features can be at any level of linguistic description. For the present experiments, the features of a unit include: meaning (the set of predications), referent (the object referred to by the unit), form (the strings and ordering relations), args (the arguments of the predications, which can be used to combine this unit with the meaning of other units), sem-cat (the semantic categorizations), boundaries (the left and right boundaries), sem-subunits and syn-subunits (the constituents), potential-syn-cat (the potential lexical categories, i.e. parts of speech), syn-cat (the actual lexical category or the phrasal category), head (the subunit acting as the head of the phrase), and footprints (the constructions that changed this unit).

A construction is an association between meaning and function (the semantic pole) and form constraints (the syntactic pole). Lexical constructions associate one or more predicates and a semantic categorization of the predicates (equal to the attribute, i.e. the type, of the predicate) in the semantic pole with the occurrence of a word string and the lexical categories (i.e. parts of speech) of that word in the syntactic pole. Grammatical constructions, in this case restricted to phrase structure constructions, associate a pattern of units with semantic constraints in the semantic pole with syntactic categories (lexical or phrasal categories) and word orders in the syntactic pole. Each construction has a score (between 0.0 and 1.0) which reflects the success of that construction in past language interactions.

Constructions in FCG consist of two parts: a conditional part (written on the right-hand side), which specifies under which conditions a construction can become active, and a contributing part (written on the left-hand side), which specifies what the construction contributes to the transient structure. Constructions must be usable both in comprehension and in production, so the conditional part is split into two 'locks': a production lock (on top), which has to match with the transient structure in production, and a comprehension lock (below it), which has to match in comprehension. When a lock fits with a transient structure, all the information of the construction (the other lock and the contributing part) gets merged in. The fact that constructions in FCG can be used both for comprehension and production is crucial for insight learning, because once the learner has been able to deduce the meaning (possibly indirectly), he can produce the same meaning himself with his own inventory and thus compare it to the utterance produced by the speaker.

A lexical construction for the word "paper" is (contributing part on the left of the arrow, conditional part on the right):

  ?word                        ?word
    sem-cat: material            # meaning:
    potential-syn-cat:      ←      {(material paper ?obj)}
      {adj, noun}                # form:
    referent: ?obj                 {(string ?word paper)}

"Paper" has two potential lexical categories: adjective (as in "a paper basket") and noun (as in "a small paper"). Here is a simplified example of a grammatical construction:

  ?np-unit
    constituents:
      {?word-unit-1, ?word-unit-2}
    sem-cat: material
    syn-cat: noun-phrase      ←
    head: ?word-unit-2
    referent: ?obj

  ?word-unit-1                 ?word-unit-2
    sem-cat: size                sem-cat: material
    referent: ?obj          ≤    referent: ?obj
    potential-syn-cat:           potential-syn-cat:
      {adjective}                  {noun}
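
The following is a minimal, hypothetical sketch (in Python, with a drastically simplified unit representation rather than real FCG feature structures and locks) of what matching and merging amount to for a construction like the one above: the conditional units must be found in the transient structure, and the contributing phrase unit is then merged in.

  # Hedged sketch of matching and merging: a unit is a dict of features and a
  # transient structure a dict of named units. Real FCG unification is far richer.

  transient = {
      "word-small": {"string": "small", "sem-cat": "size",
                     "potential-syn-cat": {"adjective"}, "referent": "?obj"},
      "word-paper": {"string": "paper", "sem-cat": "material",
                     "potential-syn-cat": {"adjective", "noun"}, "referent": "?obj"},
  }

  # A phrase-structure construction in the spirit of the example above:
  # a conditional part (what must already be present) and a contributing part
  # (what gets merged in when the lock fits).
  np_construction = {
      "conditional": [
          {"sem-cat": "size", "potential-syn-cat": {"adjective"}},
          {"sem-cat": "material", "potential-syn-cat": {"noun"}},
      ],
      "contributing": {"syn-cat": "noun-phrase", "sem-cat": "material"},
  }

  def unit_matches(pattern, unit):
      """A conditional unit matches if every required feature is compatible."""
      for feature, required in pattern.items():
          value = unit.get(feature)
          if isinstance(required, set):
              if not (isinstance(value, set) and required <= value):
                  return False
          elif value != required:
              return False
      return True

  def apply_construction(cxn, transient):
      """Match the conditional part against units, then merge in a phrase unit."""
      matched = []
      for pattern in cxn["conditional"]:
          for name, unit in transient.items():
              if name not in matched and unit_matches(pattern, unit):
                  matched.append(name)
                  break
          else:
              return None                      # lock does not fit: no application
      phrase = dict(cxn["contributing"])
      phrase["constituents"] = matched         # merge: add the new phrase unit
      transient["np-unit"] = phrase
      return transient

  print(apply_construction(np_construction, transient)["np-unit"])
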
Meta-level processing

The consecutive application of constructions expands the transient structure to go from meaning to form or vice versa. But occasions will arise when no construction can be applied, particularly in language learning. The listener then moves to the meta-level to repair the situation and to then consolidate the outcome of the repair.

Syntactic meta-operators

The listener can try to find a construction which is only partially matching, and either coerce words to fit into that construction (syntactic coercion) or expand the applicability of the construction (extension). More specifically:

+ Coercion: A construction is found that is semantically compatible, but one or more words do not have the appropriate lexical category (as in the example of "googled", where a noun occurs in a context where a verb is expected). The meta-operator then adds the lexical category to the word-unit in the transient structure, and the construction then applies (see the sketch after this list).

+ Extension: A construction is found for which all the components match but with a different word ordering. In this case, the ordering constraint can be relaxed and the construction applied.

+ Reduction: All components of an existing construction match the transient structure, but there is a superfluous unit and no matching construction without this unit. This superfluous unit can be ignored and the construction applied as well as possible.
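
As an illustration of the first of these operators, here is a hypothetical sketch of coercion, reusing the simplified representation of the earlier sketch (the data and the sem-cat value are invented for illustration; this is not the actual implementation): when a slot fails only on the lexical category, that category is added to the word unit so that the construction can subsequently apply.

  # Hedged sketch of syntactic coercion on the simplified representation used
  # above: if a slot fails only because of the lexical category, add that
  # category to the word unit so that routine application can then succeed.

  def coerce(pattern, unit):
      """Return True if the unit could be coerced to fit the pattern."""
      for feature, required in pattern.items():
          if feature == "potential-syn-cat":
              continue                              # ignore the category for now
          if unit.get(feature) != required:
              return False                          # semantic mismatch: no coercion
      missing = pattern.get("potential-syn-cat", set()) - unit.get("potential-syn-cat", set())
      unit.setdefault("potential-syn-cat", set()).update(missing)
      return True

  # "googled": a word known only as a noun, coerced into a verb slot.
  word_unit = {"string": "googled", "sem-cat": "action", "potential-syn-cat": {"noun"}}
  verb_slot = {"sem-cat": "action", "potential-syn-cat": {"verb"}}

  print(coerce(verb_slot, word_unit), word_unit["potential-syn-cat"])
  # True, with "verb" added to the word's potential lexical categories
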
Semantic meta-operators

When no partially matching constructions can be found, it may be possible to use the situation model and combine units for words or phrases based on semantics, specifically:

+ Build-or-extend-group: If two words or word groups expressing unary predicates refer to the same object, they can be combined. For example, if there is a group for "wooden table" (based on an existing construction) and the utterance is "small wooden table", the word-unit for "small" can be linked in with the group-unit for "wooden table". The group-unit retains the same referent.


+ Build-hierarchy: When a relational word is encountered, i.e. a word which introduces a predicate with more than one argument, such as moves-away-from, and no constructions are available to integrate it into the transient structure, the meta-operator looks in the world model to detect which object plays which role and then combines the units for these objects into a new hierarchical unit. The meta-operator also needs to decide which of the arguments is going to be the referent. In principle, it could be any argument (e.g. the event of moving, the mover, or the object being moved away from). In practice, the referent is determined by the semantic network that is expressed. For example, in the sentence "the ball on the table", the referent of the unit based on the relational word "on" is the ball, whereas in "He wants the ball (to be) on the table" the referent is the on-relation itself.
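
The following hypothetical sketch illustrates the core of Build-hierarchy on the same simplified representation (names and data are invented for illustration): the unit of the relational word is grouped with the units whose referents fill its argument roles according to the situation model, and one role is chosen as the referent of the new unit.

  # Hedged sketch of the Build-hierarchy meta-operator: combine the unit of a
  # relational word with the units referring to its arguments (as found in the
  # situation model) into one new hierarchical unit.

  def build_hierarchy(relation_unit, argument_roles, transient, referent_role):
      """argument_roles maps role names (e.g. 'arg-1') to the objects that play
      them according to the situation model; referent_role picks the referent."""
      constituents = [name for name, unit in transient.items()
                      if unit.get("referent") in argument_roles.values()]
      new_unit = {
          "constituents": [relation_unit] + constituents,
          "referent": argument_roles[referent_role],
      }
      transient["hierarchy-unit"] = new_unit
      return new_unit

  transient = {
      "word-ball":  {"string": "ball",  "referent": "obj-2"},
      "word-table": {"string": "table", "referent": "obj-1"},
      "word-on":    {"string": "on",    "referent": "rel-1"},
  }
  # In "the ball on the table" the referent of the new unit is the ball (arg-1);
  # in "He wants the ball (to be) on the table" it would be the relation itself.
  roles = {"arg-1": "obj-2", "arg-2": "obj-1"}
  print(build_hierarchy("word-on", roles, transient, referent_role="arg-1"))
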
Consolidation

When an utterance could be successfully parsed after one or more repairs, the learner activates consolidation operators that integrate the insights that were obtained into his construction inventory. In some cases this is straightforward; for example, coercion can be consolidated by adding the relevant lexical category to the potential lexical categories of a word. In other cases, more complex manipulations are required: existing constructions are first copied, and then additions and changes are made.

Alignment

The meta-operators and learning-operators are hypotheses made by the learner about the language of the speaker, and mistakes are unavoidable. The learner therefore needs an additional mechanism to progressively discard wrong hypotheses based on further input. We have achieved this by updating the scores of constructions using the well-known lateral inhibition learning rule. Knowing which constructions c_i need an increased score is easy: they are the constructions that were used on the path towards the final transient structure. We use the following update rule: σ_ci ← σ_ci (1 − γ) + γ, with γ = 0.1 a constant. Which competing constructions c_j need to be decreased? First of all, all constructions that started off a wrong branch in the search space during comprehension, i.e. a branch which is not on the path to the final solution. Next, the listener himself produces an utterance based on the meaning deduced from the comprehension process and then finds all constructions that start off a wrong branch while producing. Their scores need to be decreased as well, using the following update rule: σ_cj ← σ_cj (1 − γ).
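
In code, the update amounts to the following minimal sketch (Python; the bookkeeping of which constructions were used and which ones competed is assumed to be given):

  # Hedged sketch of the lateral inhibition update, with gamma = 0.1 as in the
  # text; "scores" maps construction names to their current score in [0.0, 1.0].

  def align(scores, used, competing, gamma=0.1):
      for c in used:                          # constructions on the successful path
          scores[c] = scores[c] * (1 - gamma) + gamma
      for c in competing:                     # constructions that opened wrong branches
          scores[c] = scores[c] * (1 - gamma)
      return scores

  scores = {"np-cxn": 0.5, "np-cxn-variant": 0.5}
  print(align(scores, used=["np-cxn"], competing=["np-cxn-variant"]))
  # the used construction rises (0.5 -> 0.55), the competitor decays (0.5 -> 0.45)
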

Results

We now report on two (of many more) experiments exercising the cognitive architecture of Figure 1 and the meta- and learning-operators from the previous section. We have implemented a tutor which is initialized with a lexicon of 40 lexical constructions and a grammar with 30 grammatical constructions. The tutor grammar includes adverbs, adjectives, nouns, verbs, prepositions, pronouns and relative pronouns, as well as noun phrases of different levels of complexity, verb phrases, main clauses and relative clauses. The tutor produces a stream of utterances, each describing a particular topic (object or event) in a scene. Some example utterances are "Paul sees (the) red carpet (that) Emilia wants" or "Paul believes (that) Emilia wants (the) carpet on (the) big wooden table". The learner is initialized with the same lexicon (because we focus here on the acquisition of grammar). He is endowed with the various operators described above, but without any grammatical constructions or syntactic categories (parts of speech and phrases).

In a concrete interaction (following the classical Naming Game), tutor and learner share the same situation model (represented as a semantic network as in Figure 3). The tutor then chooses a topic (one object to be referred to) and produces a description of this topic using his lexicon and grammar. Then the learner tries to parse the utterance and interpret the extracted meaning against the situation model. The interaction is successful when the learner was able to identify the topic. This may involve both routine and meta-level processing followed by learning. Each experiment is carried out for 5 different tutor-learner pairs, using random choices from a set of 20 situations, so that results can be compared.
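
A minimal sketch of this interaction script is given below (Python; the components produce, comprehend, interpret, consolidate and align are hypothetical placeholders for the machinery described earlier, as is the objects attribute of the situation):

  # Hedged sketch of one language game between tutor and learner.

  import random

  def play_game(tutor, learner, situation):
      topic = random.choice(situation.objects)          # tutor picks a topic
      utterance = tutor.produce(topic, situation)       # ... and describes it
      meaning = learner.comprehend(utterance)           # routine + meta-level processing
      guessed_topic = learner.interpret(meaning, situation)
      success = (guessed_topic == topic)
      if success:
          learner.consolidate()                         # capture insights as constructions
      learner.align(success)                            # lateral inhibition update
      return success
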
Experiment 1. Lexical categories given

The first experiment tests out the learning operators assuming that the learner already knows the lexical categories, i.e. parts of speech, of the words in the lexicon. Moreover, grammatical constructions are limited to those having only one semantic category and one syntactic category per subunit. The task is to learn the grammatical constructions. Figure 4 shows the results, measuring communicative success, inventory size (the number of constructions of the learner) and alignment (how often the listener's reproduction of the meaning of an utterance is equal to the utterance of the speaker).

Figure 4: Grammar learning with lexical categories known. 800 language games are shown (x-axis). The learner reaches the same number of grammatical constructions as the tutor (namely 30, right y-axis) and total alignment (left y-axis), demonstrating that he successfully acquired the grammar. The shading around the lines represents the confidence interval over 5 runs.

Experiment 2. Multiple semantic categories

The second experiment assumes that the learner does not know any a priori lexical categories. This obviously makes the learning problem harder.


Also, grammatical constructions can have more than one semantic category per slot, which means that constructions can be more general. They still have only one syntactic category per slot, whereas individual words can have several lexical categories (syncretism). Because constructions can be more general, we end up with a smaller inventory.

Results for inventory size and alignment are shown in Figure 5. The graph also shows syntactic ambiguity (the number of superfluous branches when applying constructions, plus the number of possible combinations considered by the semantic meta-operators, divided by the number of words in the utterance) and semantic ambiguity (the number of situations considered but cut off in erroneous branches, divided by the number of variables introduced by all the words). Thanks to grammar, both types of ambiguity get drastically reduced, which shows that an important function of grammar is to dampen combinatorial search.
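
Written out as ratios (a direct transcription of the definitions above, not code from the system), the two measures are:

  # Direct transcription of the two ambiguity measures defined in the text.

  def syntactic_ambiguity(superfluous_branches, semantic_combinations, n_words):
      """(superfluous construction branches + combinations tried by the semantic
      meta-operators) per word of the utterance."""
      return (superfluous_branches + semantic_combinations) / n_words

  def semantic_ambiguity(situations_cut_off, n_variables):
      """Situations considered but cut off in erroneous branches, per variable
      introduced by the words of the utterance."""
      return situations_cut_off / n_variables
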
Figure 5: Grammar learning without lexical categories known. The learner reaches fewer grammatical constructions (25) and total alignment, although this takes longer than in Figure 4.

Conclusion

The paper explored the hypothesis that usage-based grammar learning can be modeled as an insight problem-solving process. The computational simulations showed that this is possible, requiring (i) a strong semantic component to make meaning-driven learning possible, (ii) a meta-level architecture similar to SOAR or ACT-R, and (iii) appropriate learning and alignment operators.

Of course, the model is much simpler than the full complexity of the resources that human language learners bring to bear on language acquisition. However, the advantage of a simpler system is that the minimal characteristics of a language learner become more starkly visible. Moreover, this method allows us to causally examine the impact of operators, and thus supports empirical research into which operators are available to human language learners. The simulations have already been scaled up to sentences with much higher complexity, including full recursion, and will be tested against corpora of child-directed speech, such as CHILDES, in the future.

Acknowledgments

The research reported here has been funded by an ICREA Research fellowship to LS and a Marie Curie Integration Grant EVOLAN. The project has received further funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no 308943, which is the FET-OPEN Insight project. We are indebted to Paul Van Eecke for comments.

References

Bannard, C., Lieven, E., & Tomasello, M. (2009). Modeling children's early grammatical knowledge. PNAS, 106(41), 17284-17289.

Bod, R., & Scha, R. (2003). Data-oriented parsing. Stanford, CA: Center for the Study of Language and Information.

Bybee, J. (2006). From usage to grammar: the mind's response to repetition. Language, 82(4), 711-733.

Chang, N. (2008). Constructing grammar: A computational model of the emergence of early constructions. Berkeley: Computer Science Division, University of California, Berkeley.

Goldberg, A. E. (2006). Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.

Koehler, W. (1956). The mentality of apes. New York: Routledge and Kegan Paul.

Langacker, R. W. (2000). A dynamic usage-based model. In M. Barlow & S. Kemmer (Eds.), Usage-based models of language (pp. 1-63).

Ohlsson, S. (1984). Restructuring revisited: II. An information processing theory of restructuring and insight. Scandinavian Journal of Psychology, 25, 117-129.

Patrick, J., & Ahmed, A. (2014). Facilitating representation change in insight problems through training. Journal of Experimental Psychology: Learning, Memory, and Cognition, 40(2), 532-543.

Spranger, M., Loetzsch, M., & Steels, L. (2012). A perceptual system for language game experiments. In L. Steels & M. Hild (Eds.), Language grounding in robots (pp. 89-110). New York: Springer.

Steels, L. (Ed.). (2011). Design patterns in Fluid Construction Grammar. Amsterdam: John Benjamins.

Steels, L., & Hild, M. (Eds.). (2012). Language grounding in robots. New York: Springer.

Tomasello, M. (1992). First verbs: A case study of early grammatical development. Cambridge: Cambridge University Press.

