    Grammar Induction as Automated Transformation between Constraint Solving
                             Models of Language ∗

                                       Ife Adebara and Veronica Dahl
                            Computing Sciences Department, Simon Fraser University
                                     iadebara@sfu.ca,veronica@cs.sfu.ca


                         Abstract

We specialize an efficient yet linguistically savvy constraint solving model of grammar induction, Womb Grammars (WG), into an interesting specific application: that of inferring the grammar of an under-resourced language, Yorùbá, from that of English, through grammatical model transformation. The model represents both the known grammar and the grammar to be inferred in terms of constraints, or properties, between pairs of constituents. This allows us to view parsing as constraint solving, and to use the parser's output (namely, the information of which constraints failed and which were satisfied) as a guideline on how to transform the known grammar model of English into the (unknown to the system) grammar model of Yorùbá. Interesting extensions to the original Womb Grammar model are presented, motivated by the specific needs of Yorùbá and similar tonality-infused languages. Our methodology is implemented in Constraint Handling Rule Grammars (CHRG) and has been used so far for inducing the grammar model of a useful subset of Yorùbá noun phrases.

∗ This research was supported by NSERC Discovery grant 31611024.

1   Introduction

Grammar induction has met with reasonable success using different models of grammar: a) as a parametrized, generative process explaining the data [Fernando C. N. Pereira and Yves Schabes, 1992; Dan Klein and Christopher D. Manning, 2004], b) as a probability model, so that learning a grammar amounts to selecting a model from a pre-specified model family [Eugene Charniak and Mark Johnson, 2005; Mengqiu Wang and Noah A. Smith and Teruko Mitamura, 2007; Shay B. Cohen and Noah A. Smith, 2010], and c) as a Bayesian model of machine learning [William P. Headden and Mark Johnson and David McClosky, 2009].

Using linguistic information from one language for the task of describing another language has also yielded good results, albeit for specific tasks (such as disambiguating the other language [David Burkett and Dan Klein, 2008], or fixing morphological or syntactic differences by modifying tree-based rules [Lionel Nicolas and Miguel A. Molinero and Benoît Sagot and Elena Sánchez Trigo and Éric de La Clergerie and Jacques Farré and Joan Miquel Vergés, 2009]) rather than for syntax induction.

This usually requires parallel corpora; an interesting exception is [Shay B. Cohen and Noah A. Smith, 2010], where information from the models of two languages is shared to train parsers for the two languages jointly. This is accomplished by tying grammar weights in the two hidden grammars, and is useful for learning dependency structure in an unsupervised empirical Bayesian framework.

Our novel approach, in contrast, works for more than just specific tasks such as disambiguation, and needs neither a pre-specified model family, nor parallel corpora, nor any of the typical models of machine learning. It proceeds instead by automatically transforming a given (task-independent) grammar description of a source language, using just the lexicon of the target language plus a representative input set of correct phrases in the target language.

Both descriptions (the syntax of the source language and of the target-language subset addressed) are stated in terms of linguistic constraints (also called "properties" in the linguistic literature) between pairs of constituents, although for the target-language constraints we depart from the classic formalism [Blache, 2005] in view of Yorùbá-motivated extensions. This choice allows us a great degree of modularity, in which constraints can be checked efficiently through constraint solving.

From the perspective of problem solving, our modelling framework thus compares favourably with others: it combines linguistic formality with efficient problem solving, and can transfer to other languages, in particular languages that use tones to differentiate meaning.

2   Motivation

Close to seven thousand languages are currently spoken in the world, the majority of which are understudied. Linguists cannot keep up with their study even for educational purposes, and there is a growing need for their automatic processing as well, since the amount of text sources grows much faster than humans can process them. To make matters worse, most linguistic resources are poured into English and a handful of
other first-world languages, leaving the vast majority of languages and dialects under-explored. Clearly, automating the discovery of an arbitrary language's grammar model would render phenomenal service to the study and preservation of linguistic diversity.

Scientifically, we wanted to explore to what extent the parsing-as-constraint-solving paradigm of Natural Language Processing (NLP) could buy us a great degree of linguistic descriptive formality without sacrificing efficiency, in the realm of grammar induction, and in particular for inducing Yorùbá, which is severely under-resourced and in which one of the authors has expertise.

Yorùbá belongs to the Yoruboid group of the Kwa branch of the Niger-Congo language family, which cuts across most of sub-Saharan Africa. It is a tonal dialect continuum comprising about 20 distinctive dialects, spoken by over 30 million people in the western part of Nigeria [Fagborun, 1994].

3   Background

Among the linguistic theories that lend themselves best to constraint-based implementation are those that split the information previously packed into one rewriting rule into several constraints, or properties. These constraint-based or property-based theories, such as Property Grammars (PG) [Blache, 2005], evolved from Immediate Dominance / Linear Precedence (IDLP), which unfolds a rewrite rule into the two constraints of immediate dominance (expressing which categories are allowable daughters of a phrasal category) and linear precedence (expressing which of the daughters must precede which others).

For example, in the PG framework English noun phrases can be described through a few constraints such as precedence (a determiner must precede a noun, an adjective must precede a noun), uniqueness (there must be at most one determiner), exclusion (an adjective phrase must not coexist with a superlative), obligation (a noun phrase must contain the head noun), and so on. Instead of resulting in either a parse tree or failure, as traditional parsing schemes do, such frameworks characterize a sentence through the list of the constraints a phrase satisfies and the list of constraints it violates, so that even incorrect or incomplete phrases will be parsed. Moreover, it is possible to relax some of the constraints by declaring relaxation conditions in modular fashion. [Dahl and Blache, 2004] encodes the input PG into a set of CHRG rules that directly interpret the grammar in terms of satisfied or relaxed constraints, which are then propagated while a syntactic tree is built as a side effect. Womb Grammars, a recent adaptation of this framework into grammar transformation [Dahl and Miralles, 2012], also automates the induction of a language's syntax from that of another, focusing on failed constraints.

In such theories, the modularity obtained by splitting grammatical information apart into constraints leads naturally to more robust parsers, since it allows us to clearly identify from the parser's output which constraints are satisfied and which fail. This lets us accept even incomplete or incorrect sentences, instead of silently failing to parse them, and we can also produce some indication of a sentence's degree of acceptability by analyzing the failed properties.

The PG formalism presently comprises the following seven categories (we adopt the handy notation of [Duchier, Dao, and Parmentier, 2013], and the same example):

Constituency A : S, children must have categories in the set S
Obligation A : △B, at least one B child
Uniqueness A : B!, at most one B child
Precedence A : B ≺ C, B children precede C children
Requirement A : B ⇒ C, if B is a child, then so is C
Exclusion A : B ⇎ C, B and C children are mutually exclusive
Dependency A : B ∼ C, the features of B and C are the same

Example 1. If we denote determiners by D, nouns by N, personal nouns by PN, verbs by V, noun phrases by NP, and verb phrases by VP, the context-free rules NP → D N and NP → N, which determine what a noun phrase is, can be translated into the following equivalent constraints: NP : {D, N}, NP : D!, NP : △N, NP : N!, NP : D ≺ N, D : {}, N : {}.
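To make this representation concrete, the following minimal plain-Prolog sketch (our own illustration rather than the paper's CHRG code; prop/1, check/2 and violates/2 are hypothetical names) states some of the Example 1 properties as facts and collects those that a phrase, given as a list of pre-lexical categories, violates:

    % Some Example 1 properties as Prolog facts (illustrative encoding).
    prop(obligation(np, n)).      % NP : △N, at least one noun
    prop(uniqueness(np, d)).      % NP : D!, at most one determiner
    prop(precedence(np, d, n)).   % NP : D ≺ N, determiners precede nouns

    % check(+Cats, -Violations): the properties the phrase violates.
    check(Cats, Violations) :-
        findall(P, (prop(P), violates(P, Cats)), Violations).

    violates(obligation(_, B), Cats) :-
        \+ member(B, Cats).
    violates(uniqueness(_, B), Cats) :-
        findall(B, member(B, Cats), Bs), length(Bs, K), K > 1.
    violates(precedence(_, B, C), Cats) :-
        append(_, [C|Rest], Cats), member(B, Rest).  % some C before some B

Under this encoding, check([d, n], V) yields V = [], while the duplicated determiner in check([d, d, n], V) flags the uniqueness property, as in the "the the book" example discussed in Section 5.3.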
4   The Intuitive Idea

By giving up syntactic trees as a focus and focusing instead on grammar constraints, we can arrive at more modular, more flexible and less costly models. To see this, consider that if we were to work with tree-oriented rules such as:

noun_phrase --> determiner, noun.

such rules would fail for languages such as Yorùbá, where, for instance, nouns precede determiners, and more than one determiner can accompany a noun. For instance, ilé yìí náà wó (house this the collapse) means that some particular house in close proximity to the speaker collapsed, whereas ilé kan náà wó (house a the collapse) means that a house the speaker may not be familiar with collapsed.

Transforming a tree-oriented model into a language model for Yorùbá would require changing every rule where these two constituents are involved, as well as creating a grammar symbol that covers sequences of determiners rather than just a single one.

In a constraint-based model, in contrast, the needed transformation can be expressed in terms of separate constraints: we need to replace the English precedence constraint with its converse (in a noun phrase, a noun must precede a determiner)¹, and delete the constraint that in a noun phrase determiners must be unique. These modifications carry over to the entire grammar without further ado.

This intuition leads directly to our WG transformation methodology: first we implement an executable version (i.e., a parser) of our constraint-based grammar model of English; then, instead of running English phrases by it, we feed it a Yorùbá input phrase. Given that we have a Yorùbá lexicon, this parses into a pre-lexical string (e.g. "noun, adjective, determiner"). Obviously, many of the English constraints (in particular, for example, the English precedence constraints) will fail to hold for it. Examining such failures with respect to a representative set of input phrases in Yorùbá, our system will be able to determine whether for Yorùbá we must, e.g., induce the converse of an English precedence constraint (if the converse ordering is found in every single Yorùbá phrase), or simply delete the English constraint (if both orderings are allowed), or make it conditional on some linguistic feature being present (if in some cases one ordering routinely held, whereas in other cases it didn't), etc.

¹ In fact a subtler algorithm is needed; this will be covered later.

5   Our Methodology

5.1   Constraint Modification: The WG Paradigm

Just like its ancestor, Property Grammars, the WG paradigm describes a language's phrases in terms of constraints between pairs of direct daughters of a phrasal category. The classic constraints are constituency (which constituents are allowable as immediate daughters of a phrasal category), precedence (in a given phrase, a certain daughter must precede a certain other daughter), obligation (a certain daughter must occur in a given phrase), uniqueness (a certain daughter can occur only once in a given phrase), requirement (if a certain daughter occurs in a given phrase, a certain other daughter is also required), dependency (the same features must be shared between two daughters of a given phrase), and exclusion (if a certain daughter occurs in a given phrase, a certain other daughter may not co-occur).

WG extends the parsing capabilities implicit in these properties into a model of grammatical induction, in addition to parsing. In its first incarnation [Dahl and Miralles, 2012], it achieved greater efficiency by focusing on failed constraints only, rather than explicitly calculating all successful and unsuccessful constraints. However, in adapting the model to Yorùbá, we have found it more efficient to explicitly calculate satisfied constraints as well, as we go along. This is partly because, in order to arrive at as nuanced a description as needed for Yorùbá, we had to extend the classical Property Grammar model to accommodate what we call conditional constraints: those whose satisfaction depends on the run-time values of combinations of features (more on this later).

The general WG model can be described as follows. Let LS be the source language; its syntactic component will be noted L^S_syntax. Likewise, we call the target language LT, its lexicon L^T_lex and its syntax L^T_syntax. If we can get hold of a sufficiently representative set of phrases in LT that are known to be correct (a set where our desired subset of language is represented), we can feed these to a hybrid parser consisting of L^S_syntax and L^T_lex. This will result in some of the sentences being marked as incorrect by the parser. An analysis of the constraints these "incorrect" sentences violate can subsequently reveal how to transform L^S_syntax so it accepts as correct the sentences in the corpus of LT, i.e., how to transform it into L^T_syntax, by modifying the constraints that were violated into constraints that accept the input. Figures 1 and 2 respectively show the problem and our solution in schematic form.

[Figure 1: The Problem. The target corpus L^T_corpus, the target lexicon L^T_lex and the source syntax L^S_syntax feed the WG parser, which outputs the violated syntax properties. Figure 2: The Solution. The Model Transformation Module takes L^T_lex, L^S_syntax and the violated properties, and produces the unknown target syntax L^T_syntax.]

An Example

Let LS = English and LT = Yorùbá, and let us assume that English determiners always precede the noun they modify, while in Yorùbá they always post-cede it (an oversimplification, just for illustration purposes). Thus "a red book" is correct English, whereas in Yorùbá we would more readily say "ìwé pupa kan" (book, red, a).

If we plug the Yorùbá lexicon and the English syntax constraints into our WG parser, and run a representative corpus of (correct) Yorùbá noun phrases through the resulting hybrid parser, the said precedence property will be declared unsatisfied when hitting phrases such as "ìwé pupa kan". The model transformation module can then look at the entire list of unsatisfied constraints, and produce the missing syntactic component of LT's parser by modifying the constraints in L^S_syntax so that none are violated by the corpus sentences.

Some of the necessary modifications are easy to identify and perform; e.g., for accepting "ìwé pupa kan" we only need to delete the (English) precedence requirement of determiner over noun (noted det < n). However, subtler modifications may be in order, after some statistical analysis in a second round of parsing: if in our LT corpus, which we have assumed representative, all determiners appear after the noun they modify, Yorùbá is sure to include the reverse precedence property from English: n < det. So in this case, not only do we need to delete det < n, but we also need to add n < det.
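This pipeline can be rendered as a minimal plain-Prolog sketch (again our own illustration; lexicon/2, source_prop/1 and hybrid_parse/2 are hypothetical names, and the lexicon entries are toy ones with diacritics dropped from the atoms): the target lexicon maps each word to its category, and the resulting pre-lexical string is then checked against the source-language properties:

    % Toy Yorùbá lexicon entries: lexicon(Form, Category).
    lexicon(iwe,  noun).         % ìwé  (book)
    lexicon(pupa, adjective).    % pupa (red)
    lexicon(kan,  determiner).   % kan  (a)

    % One English property: in a noun phrase, det < n.
    source_prop(precedence(determiner, noun)).

    % hybrid_parse(+Words, -Violated): look each word up in the target
    % lexicon, then collect the source properties that the resulting
    % category string violates.
    hybrid_parse(Words, Violated) :-
        maplist(lexicon, Words, Cats),
        findall(P, (source_prop(P), violated(P, Cats)), Violated).

    violated(precedence(B, C), Cats) :-
        append(_, [C|Rest], Cats), member(B, Rest).  % some C before some B

The query hybrid_parse([iwe, pupa, kan], V) yields V = [precedence(determiner, noun)], i.e., exactly the det < n failure that the model transformation module then turns into n < det.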
5.2   Constraint Solving as the Implementation Means

Constraint Satisfaction has yielded powerful results in many AI areas, including human language processing. To a large extent, this powerful problem-solving paradigm has allowed us to overcome the divide that prevailed until relatively recently between languages for problem description and those for problem solving. Since the advent of CHR [Frühwirth, 1998] and of its grammatical counterpart CHRG [Christiansen, 2005], constraint-based linguistic formalisms can materialize through fairly direct methodologies.

In a nutshell, CHRG rules rewrite constraints into other constraints, subject to possible checks described in a rule's guard and stated as Prolog calls. Their general format is:

α -\ β /- γ ::> G | δ.

This rule is called a propagation rule. The part of the rule preceding the arrow ::> is called the head, G the guard, and δ the body; α, β, γ, δ are sequences of grammar symbols and constraints such that β contains at least one grammar symbol, and δ contains exactly one grammar symbol which is a nonterminal (and perhaps constraints); α (γ) is called the left (right) context and β the core of the head; G is a conjunction of built-in constraints as in Constraint Handling Rules (CHR), and no variable in G can occur in δ. If the left or right context is empty, the corresponding marker is left out, and if G is empty (interpreted as true), the vertical bar is left out. The convention from Definite Clause Grammars (DCG) is adopted that constraints (i.e., non-grammatical material) in the head and body of a rule are enclosed in curly brackets. Gaps and parallel match are not allowed in rule bodies. A gap in the rule heads is noted "...". Gaps are used to establish references between two long-distance elements.

A simplification (grammar) rule is similar to a propagation rule except that the arrow is replaced by <:>. A simpagation (grammar) rule is similar to a simplification rule, except that one or more grammar symbols are prefixed with !, which means that those grammar symbols are not removed from the constraint store. Details of the CHRG encoding, how to download the CHRG system and how to use the encoding can be found in the CHRG manual [Christiansen, 2010].

Implementing the WG paradigm through constraint solving allows us to both express and test linguistic constraints in very modular fashion, and results in succinct yet executable code.

In our CHRG implementation the appropriate WG constraints are entered in terms of a constraint g/1, whose argument stores each possible grammar property. For instance, our English-grammar hybrid parser for noun phrases includes the constraints g(obligatority(noun, pronoun, proper_noun)), g(constituency(determiner)), g(precedence(determiner,adjective)), g(unicity(determiner)), g(requirement(noun, determiner)), g(dependence(determiner,noun)), and so on. These properties are weeded out upon detection of a violation by CHRG rules that look for them; e.g., an input noun phrase where an adjective precedes a noun may provoke deletion of the constraint g(precedence(noun,adjective)) when the following CHRG rule applies:

!word(C2,F1,_), ..., !word(C1,F2,_),
{g(precedence(C1,C2))} <:>
{fail(precedence(C1,C2), F1, F2)}.

Each word is stored in a CHRG symbol word/3, along with its category and features (e.g. word(noun, [neutral, n/a, common, inanimate], aso)). Since the CHRG parse predicate stores and abstracts the position of each word in the sentence, this simpagation rule is triggered when a word of category C2 comes before one of category C1, given the existence of the grammar constraint that C1 must precede C2². Each of the properties dealt with has similar rules associated with it.

² Recall that in CHRG syntax the symbols prefixed with exclamation points are kept, while the ones without are replaced by the body of the rule, in this case an update constraint that invokes some housekeeping procedures.
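For readers without the CHRG system at hand, the rule's behaviour can be approximated in plain CHR (using SWI-Prolog's CHR library; fail_prop/3 and the explicit position argument of word/3 are our own devices, standing in for the positions that CHRG abstracts away):

    :- use_module(library(chr)).

    :- chr_constraint word/3, g/1, fail_prop/3.

    % word(Category, Features, Position). If some C2 word precedes some
    % C1 word while the store holds g(precedence(C1,C2)), that property
    % is removed and its failure recorded, as in the CHRG rule above.
    word(C2, FB, PB), word(C1, FA, PA) \ g(precedence(C1, C2)) <=>
        PB < PA | fail_prop(precedence(C1, C2), FB, FA).

Posting word(adjective, [descriptive], 1), word(noun, [common], 2) and g(precedence(noun, adjective)) leaves fail_prop(precedence(noun, adjective), [descriptive], [common]) in the store, mirroring the deletion-plus-failure bookkeeping described above.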
5.3   Main Modifications to the Model Resulting from our Application to Yorùbá

Conditional Properties

The original constraint-based model, as we have seen, succeeded in detecting and signalling mistakes in a sentence without blocking the analysis. In this first model, which is parsing-oriented, incorrect sentences could be "accepted" by declaring some constraints as relaxable. For instance, while with the context-free grammar rules shown in Section 3 we wouldn't be able to parse "the the book" (a common mistake from cutting and pasting in word processors), in the constraint-based formulation we can, if we relax the uniqueness-of-determiner constraint.

Relaxation can be made conditional (e.g., a head noun's requirement for a determiner can be made relaxable in case the head noun is generic and in plural form, as in "Lions sleep tonight"). The failure of relaxable constraints is signalled in the output, but does not block the entire sentence's analysis. Implementations not including constraint relaxation capabilities implicitly consider all properties as relaxable. And in fact, when exploiting constraint-based parsing for grammar transformation rather than for parsing, this is exactly what we need, since in an unknown language any constraint may need to be "relaxed" and, of course, even corrected.

For that reason, we have considered it more appropriate, in the context of grammar induction, to state "exceptions" as part of a property itself, rather than separately in a relaxation definition. Thus we have created conditional properties, e.g. "precedence(pronoun,noun)", which holds if a pronoun that precedes a noun is plural, is a personal pronoun, and can be used in place of both animate and inanimate nouns.

We have also used conditional properties to express choice: whereas before we could state that one element was obligatory in a phrase, we can now state that a property needs one of a list of constituents. E.g., the g(obligatority(noun, pronoun, proper_noun)) constraint only needs to find one of noun, pronoun or proper noun to succeed. If none of these is found, a failure occurs, which indicates that an obligatory head is absent.

From Failure-Driven to Success-and-Failure-Driven

In view of unprecedented efficiency where property-grammar-related models were concerned, the original WG model calculated only failed properties, on the assumption that these could be trusted to be the complement of the successful properties, thus obviating the need to calculate both explicitly. However, our more in-depth analysis in the context of Yorùbá has, as we have seen, uncovered the need for more nuance than simply failing or succeeding, as in the case of conditional properties. We therefore now use a success-and-failure-driven approach for inducing the grammar of our target language, Yorùbá. Each input phrase of the target language is tested with all relevant constraints for both failure and success. This makes the model slightly less efficient than if we only calculated failed properties, but of course the gain is in accuracy. Efficiency is still guaranteed by the normal CHRG way of operating: rules will only trigger when relevant; e.g., if a phrase comprises only a noun and an adjective, it will not be tested against, for instance, precedence(pronoun, determiner) or any other constraint whose categories are different from those of the input phrase. We keep a list of all properties that fail and another of those that succeed, together with the features of the categories of each input phrase and their counts. It is important to state that constituency constraints are tested only for success. This is because we are interested in checking that our target grammar shares similar constituents with our source language, and testing for failure would be irrelevant for these constraints.

Model Transformation Module

After the first round of parsing, we do a statistical analysis in a second round of parsing, where we determine which constraints should be induced as a property or a conditional property of our target language, and which need to be weeded out. First we determine which constraints should be added to the grammar, by finding all constraints that succeed in all relevant phrases. We are justified in doing so, since our input phrases are correct and representative of the target language. We also add the converse of any precedence rule that fails in all relevant phrases, so that, for instance, if precedence(pronoun, determiner) fails in all relevant phrases, precedence(determiner, pronoun) is induced as a property of our target language. Next, we decide which constraints should be conditional properties. All constraints that both fail and succeed are tested in this phase. We find these properties by searching for features that are unique to the failure and/or success of these constraints. Any constraint for which such features are found is added to the grammar as a conditional property, with its conditions (i.e., its features) clearly stated. All other constraints, which are not induced by the above rules, are not added to the grammar of Yorùbá.
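The decision logic just described can be sketched in plain Prolog as follows (our own reconstruction, not the authors' code; induce/2 and the stats/3 count representation are hypothetical). Given, for each candidate constraint, the number of relevant phrases in which it succeeded and failed, the module induces it as is, induces its converse, or demotes it to a conditional property:

    % induce(+Stats, -Decision), with Stats = stats(Constraint, Succ, Fail)
    % holding a constraint's success/failure counts over relevant phrases.
    induce(stats(C, Succ, 0), property(C)) :-
        Succ > 0.            % succeeds in all relevant phrases: induce C
    induce(stats(precedence(A, B), 0, Fail), property(precedence(B, A))) :-
        Fail > 0.            % fails in all relevant phrases: induce converse
    induce(stats(C, Succ, Fail), conditional(C, Feats)) :-
        Succ > 0, Fail > 0,  % mixed outcome: look for distinguishing features
        unique_success_features(C, Feats).

    % Stub for the feature search: the real system inspects the features
    % recorded with each success and failure, keeping those unique to the
    % successes; here a single illustrative fact stands in for it.
    unique_success_features(precedence(pronoun, noun), [plural, personal]).

For instance, induce(stats(precedence(determiner, noun), 0, 46), D) yields D = property(precedence(noun, determiner)), while a constraint with mixed counts comes out as a conditional property carrying its licensing features.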
Inducing Constraints Not Present in the Original Language

We also added a function that finds all precedence constraints that are absent from our source language but present in our target language. We are equally able to test these constraints as above, by checking whether such a constraint succeeds in all relevant phrases and whether its converse succeeds in any input phrase. If we find that the converse does not succeed in any of the input phrases, the constraint is induced as a property of the target language; otherwise, we search for a unique feature, as above.

Using Tonality to Induce Conditional Properties

Yorùbá is a tone language, so it was imperative to ensure that tones are properly represented. We adopt a full tone-marking approach, in which each word is marked with its tones in order to avoid ambiguities. The tones are also stored as features, so that, for instance, the word tútù (cold), which has two syllables carrying a high and a low tone respectively, is stored as "high, mid" in the features of the word. For instance:

[tútù] ::> word(adjective, [neutral, n/a, descriptive, both], tútù).

These tones are used to infer conditional properties in the second phase of parsing, so that if a property is found to occur around words with certain tones, the property will be conditioned on the presence of those tones. It is important to note that the tones are not used in isolation from other features: a property is said to be conditioned on tones only if tones are the only unique features present. The addition of tones makes our approach extensible to many other languages, in particular other tone languages.

In addition to the aforementioned, the system keeps a record of the English translation of every word in the input phrases; this record is consulted during parsing to produce the English equivalent of each word in the input phrase.
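To illustrate this representation, here is a small plain-Prolog rendering of tone-carrying lexicon entries (our own sketch; the extra tone-list argument and tones_of/2 are hypothetical devices, the entries merely illustrative, with diacritics dropped from the atoms):

    % word(Category, Features, Tones, Form): a lexicon entry whose tone
    % pattern is kept alongside the other features; the tone list for
    % tútù follows the "high, mid" pattern quoted above, the one for
    % ilé is our own illustrative guess.
    word(adjective, [neutral, n/a, descriptive, both], [high, mid], tutu).
    word(noun,      [neutral, n/a, common, inanimate], [mid, high], ile).

    % tones_of(+Form, -Tones): fetch a word form's tone pattern, so the
    % second parsing phase can test whether all phrases supporting a
    % property share it (making the property tone-conditioned).
    tones_of(Form, Tones) :- word(_, _, Tones, Form).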
5.4   Sample Output

Here is a subset of our induced Yorùbá grammar:

-precedence(pronoun,noun):-
pronoun precedes noun if the pronoun is plural, personal and can be used in place of both animate and inanimate nouns

-precedence(adjective,noun):-
adjective precedes noun if the adjective is plural, quantity-uncountable and can be used with both animate and inanimate nouns

-obligatority((noun;proper-noun;pronoun)):-
noun;proper-noun;pronoun are obligatory.
It succeeds in all 46 input phrases.
noun succeeds in 37 relevant phrases;
pronoun succeeds in 5 relevant phrases;
proper-noun succeeds in 4 relevant phrases;

-precedence(pronoun,adjective):-
pronoun precedes adjective in all 4 relevant phrases

-precedence(quantifier,adjective):-
quantifier precedes adjective in all 4 relevant phrases

-precedence(quantifier,pronoun):-
quantifier precedes pronoun in all 4 relevant phrases

-precedence(pronoun,determiner):-
pronoun precedes determiner in all 7 relevant phrases

-precedence(proper-noun,adjective):-
proper-noun precedes adjective in all 3 relevant phrases

-precedence(proper-noun,determiner):-
proper-noun precedes determiner in all 3 relevant phrases.

-pronoun is a constituent of noun phrases. It occurs in 12 phrases.

-adjective is a constituent of noun phrases. It occurs in 26 phrases.

The first two properties are conditional properties, and their features are explicitly stated. The obligatority constraint, which represents the head of phrases, consists of nouns, proper nouns and pronouns; we provide further information on the number of times each of these categories occurs as the head of a phrase. We do not output the features of properties other than the conditional ones, because those properties already succeed in all relevant phrases.

6   How Can Our Model Be Acquired?

Our model for grammar induction is, in a sense, already acquired for many other possible source-target language pairs (those amenable to description through the linguistic system of constraints that we accept), because rather than developing a language-dependent implementation of our model, we have implemented a generator of grammar inducers which, to work with a different pair of languages than the one we've chosen, requires only: a) a representative sample of target-language phrases, b) that the source grammar be expressed in terms of the linguistic constraints accepted in our model (namely, obligatority, precedence, constituency, requirement, dependency, uniqueness and exclusion), and c) that the target grammar's lexicon be described in the user-friendly plain-text format we provide, for the subset of words used in the input set of target-language phrases.

7   Verification and Validation

Verifying and validating that our results are correct comprises two tasks:

• checking that the input and output syntax descriptions are correct. Since these descriptions are given in terms of the linguistic Property Grammar formalism, it may be challenging for users unfamiliar with it to be sure that they cover exactly the subset of language addressed. However, there is a mechanical procedure to transform context-free grammars into property grammars [Philippe Blache, Stephane Rauzy, 2012], so if the user is more familiar with context-free grammars, and the subset of language allows it, we can at least start from a known, correct context-free (CF) version and apply the said procedure, either by hand or mechanically, in order to arrive at an equivalent formulation in terms of our linguistic constraints. Alternatively, we can consult a linguist, if one is available who specializes in the languages involved.

• checking that the set of input phrases from the target language is correct and representative. This requires at the very least a native speaker, and if possible also a linguist, given that some users' intuitions about what is correct in their language (and indeed, about even whether they use it "correctly" or not) may not be accurate.

8   Concluding Remarks

We have presented an implemented model for integrating grammatical knowledge representation and reasoning around the CHRG constraint paradigm of problem solving. Our model extends the original grammar induction formalism, WG, in ways motivated by the needs of our particular application: the automatic transformation of an English model of syntax into that of Yorùbá.

We have also presented a proof-of-concept system that allows users to input a source language's grammar description in modular and declarative fashion, in terms of the linguistic constraints that are supported (constituency, precedence, dependency, obligatority, requirement, exclusion, unicity). Since such descriptions also stand alone as a purely linguistic model of a language's syntax, our system can cater even to users who are not conversant with coding, such as pure linguists wanting to perform syntactic model transformation experiments aided by computers. Their purely linguistic (constraint-based) descriptions of syntax automatically turn into executable code when appended to our own code.

As we have seen, our system automatically transforms a user's syntactic description of a source language into that of a target language, of which only the lexicon and a set of representative sample phrases are known. While demonstrated specifically for English as source language and Yorùbá as target language, our implementation can accept any other pair of languages for inducing the syntactic constraints of one from those of the other, as long as their description can be done in terms of the supported constraints.

We have thus contributed a flexible modelling framework solidly grounded in linguistic knowledge representation which, through the magic of constraint solving, turns into an efficiently executable problem-solving tool. Through its inherent modularity, our model supports revision and adaptation quite straightforwardly: new constraints between immediate daughters of any phrase can be modularly added and implemented, without touching the rest of the system.

Also, our model can accommodate solution revisions at execution time by interacting with a linguist whenever constraints cannot be induced consistently. Visual representations of the output that would be helpful for linguists in this respect have been proposed in [Adebara I., Dahl V., 2015] but have yet to be implemented.

Other than integrating knowledge-based and efficiency-based paradigms of problem solving, our approach demonstrates that constraint programming is not just about solving classic OR problems. It is also a powerful tool for addressing interdisciplinary applications [Dahl, Miralles, and Becerra, 2012; Becerra, Dahl, and Miralles, 2013; Becerra, Dahl, and Jiménez-López, 2014; Adebara, Dahl, and Tessaris, 2015; Adebara I., Dahl V., 2015].

Further interesting ramifications of our work would include: testing our system on a larger coverage of syntax (we have only addressed noun phrases so far); testing it on other tone-sensitive languages; studying how assimilation influences language structure, especially in tone languages; and studying whether any further constraints or extensions of our approach would be needed to accommodate families of languages not easily describable within it (e.g., languages that have different categories from the source language, languages that have different inflectional paradigms from the source language, or languages that exhibit constraints not among our allowable set of constraints).

Acknowledgments

This research was supported by V. Dahl's NSERC Discovery grant 31611024.

References

Adebara, I., and Dahl, V. 2015. Domes as a prodigal shape in synthesis-enhanced parsers. In SHAPES 3.0, Constraints and Logic Programming (CSLP'04).

Adebara, I., and Dahl, V. 2015. Shape analysis as an aid for grammar induction. In SHAPES 3.0, Constraints and Logic Programming (CSLP'04).

Adebara, I.; Dahl, V.; and Tessaris, S. 2015. Completing mixed language grammars through womb grammars plus ontologies. In Christiansen, H.; Jiménez-López, M. D.; and Loukanova, R., eds., Proceedings of the International Workshop on Partiality, Underspecification and Natural Language Processing, 32–40.

Becerra, L.; Dahl, V.; and Miralles, E. 2013. On second language tutoring through womb grammars. In 12th International Work-Conference on Artificial Neural Networks (IWANN 2013), June 12-14, Tenerife, Spain.

Becerra, L.; Dahl, V.; and Jiménez-López, M. D. 2014. Womb grammars as a bio-inspired model for grammar induction. In Trends in Practical Applications of Heterogeneous Multi-Agent Systems. The PAAMS Collection. Springer International Publishing. 79–86.

Blache, P. 2005. Property grammars: A fully constraint-based theory. In Proceedings of the First International Conference on Constraint Solving and Language Processing, CSLP'04, 1–16. Berlin, Heidelberg: Springer-Verlag.

Blache, P., and Rauzy, S. 2012. Hybridization and treebank enrichment with constraint-based representations. In Proceedings of the International Conference on Language Resources and Evaluation (LREC).

Burkett, D., and Klein, D. 2008. Two languages are better than one (for syntactic parsing). In Empirical Methods in Natural Language Processing (EMNLP), 877–886.

Charniak, E., and Johnson, M. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Association for Computational Linguistics (ACL).

Christiansen, H. 2005. CHR grammars. Theory and Practice of Logic Programming (TPLP) 5(4-5):467–501.

Christiansen, H. 2010. CHRG manual. http://akira.ruc.dk/~henning/chrg/CHRGusersGuide.html.

Cohen, S. B., and Smith, N. A. 2010. Covariance in unsupervised learning of probabilistic grammars. Journal of Machine Learning Research 11:3017–3051.

Dahl, V., and Blache, P. 2004. Directly executable constraint based grammars. In Proceedings of the Journées Francophones de Programmation en Logique avec Contraintes.

Dahl, V., and Miralles, J. E. 2012. Womb grammars: Constraint solving for grammar induction. In Sneyers, J., and Frühwirth, T., eds., Proceedings of the 9th Workshop on Constraint Handling Rules, Technical Report CW 624, 32–40.

Dahl, V.; Miralles, E.; and Becerra, L. 2012. On language acquisition through womb grammars. In 7th International Workshop on Constraint Solving and Language Processing, 99–105.

Duchier, D.; Dao, T.-B.-H.; and Parmentier, Y. 2013. Model-theory and implementation of property grammars with features. Journal of Logic and Computation 19.

Fagborun, O. 1994. The Yoruba Koiné: its history and linguistic innovations. München: LINCOM EUROPA.

Frühwirth, T. W. 1998. Theory and practice of constraint handling rules. Journal of Logic Programming 37(1-3):95–138.

Headden, W. P.; Johnson, M.; and McClosky, D. 2009. Improving unsupervised dependency parsing with richer contexts and smoothing. In North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL), 101–109.

Klein, D., and Manning, C. D. 2004. Corpus-based induction of syntactic structure: Models of dependency and constituency. In Association for Computational Linguistics (ACL), 478–485.

Nicolas, L.; Molinero, M. A.; Sagot, B.; Sánchez Trigo, E.; de La Clergerie, É.; Farré, J.; and Vergés, J. M. 2009. Towards efficient production of linguistic resources: the Victoria project.

Pereira, F. C. N., and Schabes, Y. 1992. Inside-outside reestimation from partially bracketed corpora. In Association for Computational Linguistics (ACL), 128–135.

Wang, M.; Smith, N. A.; and Mitamura, T. 2007. What is the Jeopardy model? A quasi-synchronous grammar for QA. In Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 22–32.