=Paper=
{{Paper
|id=Vol-1648/paper6
|storemode=property
|title=Grammar Induction as Automated Transformation between Constraint Solving
Models of Language
|pdfUrl=https://ceur-ws.org/Vol-1648/paper6.pdf
|volume=Vol-1648
|authors=Ife Adebara,Veronica Dahl
|dblpUrl=https://dblp.org/rec/conf/ijcai/AdebaraD16
}}
==Grammar Induction as Automated Transformation between Constraint Solving
Models of Language==
Grammar Induction as Automated Transformation between Constraint Solving Models of Language∗

Ife Adebara and Veronica Dahl
Computing Sciences Department, Simon Fraser University
iadebara@sfu.ca, veronica@cs.sfu.ca

∗ This research was supported by NSERC Discovery grant 31611024.

Abstract

We specialize an efficient while linguistically savvy constraint solving model of grammar induction, Womb Grammars (WG), into an interesting specific application: that of inferring the grammar of an under-resourced language, Yorùbá, from that of English, through grammatical model transformation. The model represents both the known grammar and the grammar to be inferred in terms of constraints, or properties, between pairs of constituents. This allows us to view parsing as constraint solving, and to use the parser's output (namely, the information of which constraints failed and which were satisfied) as a guideline on how to transform the known grammar model of English into the (unknown to the system) grammar model of Yorùbá. Interesting extensions to the original Womb Grammar model are presented, motivated by the specific needs of Yorùbá and similar tonality-infused languages. Our methodology is implemented in Constraint Handling Rule Grammars (CHRG) and has been used so far for inducing the grammar model of a useful subset of Yorùbá noun phrases.

1 Introduction

Grammar induction has met with reasonable success using different models of grammar: a) as a parametrized, generative process explaining the data [Pereira and Schabes, 1992; Klein and Manning, 2004]; b) as a probability model, so that learning a grammar amounts to selecting a model from a pre-specified model family [Charniak and Johnson, 2005; Wang, Smith, and Mitamura, 2007; Cohen and Smith, 2010]; and c) as a Bayesian model of machine learning [Headden, Johnson, and McClosky, 2009].

Using linguistic information from one language for the task of describing another language has also yielded good results, albeit for specific tasks, such as disambiguating the other language [Burkett and Klein, 2008] or fixing morphological or syntactic differences by modifying tree-based rules [Nicolas et al., 2009], rather than for syntax induction.

This usually requires parallel corpora, an interesting exception being [Cohen and Smith, 2010], where information from the models of two languages is shared to train parsers for two languages at a time, jointly. This is accomplished by tying grammar weights in the two hidden grammars, and is useful for learning dependency structure in an unsupervised empirical Bayesian framework.

Our novel approach, in contrast, works for more than just specific tasks such as disambiguation, and needs neither a pre-specified model family, nor parallel corpora, nor any of the typical models of machine learning. It proceeds instead through automatically transforming a given (task-independent) grammar description of a source language, from just the lexicon of the target language plus a representative input set of correct phrases in the target language.

Both descriptions (the syntax of the source and of the target language subset addressed) are stated in terms of linguistic constraints (also called "properties" in the linguistic literature) between pairs of constituents, although for the target language constraints we depart from the classic formalism [Blache, 2005] in view of Yorùbá-motivated extensions. This choice allows us a great degree of modularity, in which constraints can be checked efficiently through constraint solving.

From the perspective of problem solving, our modelling framework thus compares favourably with others: it combines linguistic formality with efficient problem solving, and can transfer into other languages, in particular languages that use tones to differentiate meaning.

2 Motivation

Close to seven thousand languages are currently spoken in the world, the majority of which are understudied. Linguists cannot keep up with their study even for educational purposes, and there is a growing need for their automatic processing as well, since the amount of text sources grows much faster than humans can process them. To make matters worse, most linguistic resources are poured into English and a handful of other first-world languages, leaving the vast majority of languages and dialects under-explored. Clearly, automating the discovery of an arbitrary language's grammar model would render phenomenal service to the study and preservation of linguistic diversity.

Scientifically, we wanted to explore to what extent the parsing-as-constraint-solving paradigm of Natural Language Processing (NLP) problem solving could buy us a great degree of linguistic descriptive formality without sacrificing efficiency, in the realm of grammar induction, and in particular for inducing Yorùbá, which is severely under-resourced and in which one of the authors has expertise.
Yorùbá belongs to the Yoruboid group of the Kwa branch of the Niger-Congo language family, which cuts across most of sub-Saharan Africa. It is a tonal dialect continuum comprising about 20 distinctive dialects, spoken by over 30 million people in the western part of Nigeria [Fagborun, 1994].

3 Background

Among the linguistic theories that lend themselves the most to constraint-based implementation are those that split the information previously packed into one rewriting rule into several constraints or properties. These constraint-based or property-based theories, such as Property Grammars (PG) [Blache, 2005], evolved from Immediate Dominance / Linear Precedence (IDLP), which unfolds a rewrite rule into the two constraints of immediate dominance (expressing which categories are allowable daughters of a phrasal category) and linear precedence (expressing which of the daughters must precede which others).

For example, in the PG framework, English noun phrases can be described through a few constraints such as precedence (a determiner must precede a noun, an adjective must precede a noun), uniqueness (there must be at most one determiner), exclusion (an adjective phrase must not coexist with a superlative), obligation (a noun phrase must contain the head noun), and so on. Instead of resulting in either a parse tree or in failure, as traditional parsing schemes do, such frameworks characterize a sentence through the list of the constraints a phrase satisfies and the list of constraints it violates, so that even incorrect or incomplete phrases will be parsed. Moreover, it is possible to relax some of the constraints by declaring relaxation conditions in modular fashion. [Dahl and Blache, 2004] encodes the input PG into a set of CHRG rules that directly interpret the grammar in terms of satisfied or relaxed constraints, which are then propagated while a syntactic tree is built as a side effect. Womb Grammars, a recent adaptation of this framework to grammar transformation [Dahl and Miralles, 2012], also automates the induction of a language's syntax from that of another, focusing on failed constraints.

In such theories, the modularity obtained by splitting grammatical information apart into constraints leads naturally to more robust parsers, since it allows us to clearly identify from the parser's output which constraints are satisfied and which fail. This allows us to accept even incomplete or incorrect sentences, instead of silently failing to parse them. We can also produce some indication of the sentence's degree of acceptability by analyzing the failed properties.

The PG formalism presently comprises the following seven categories (we adopt the handy notation of [Duchier, Dao, and Parmentier, 2013], and the same example):

Constituency A : S, children must have categories in the set S
Obligation A : △B, at least one B child
Uniqueness A : B!, at most one B child
Precedence A : B ≺ C, B children precede C children
Requirement A : B ⇒ C, if B is a child, then also C is a child
Exclusion A : B ⇎ C, B and C children are mutually exclusive
Dependency A : B ∼ C, the features of B and C are the same

Example 1. If we denote determiners by D, nouns by N, personal nouns by PN, verbs by V, noun phrases by NP, and verb phrases by VP, the context-free rules NP → D N and NP → N, which determine what a noun phrase is, can be translated into the following equivalent constraints: NP : {D, N}, NP : D!, NP : △N, NP : N!, NP : D ≺ N, D : {}, N : {}.

4 The Intuitive Idea

By giving up syntactic trees as a focus and focusing instead on grammar constraints, we can arrive at more modular, more flexible and less costly models. To see this, consider that if we were to work with tree-oriented rules such as:

noun_phrase --> determiner, noun.

such rules would fail for languages such as Yorùbá, where, for instance, nouns precede determiners, and more than one determiner can accompany a noun. For instance, "ilé yı̀ı́ náà wó" (house this the collapse) means some particular house in close proximity to the speaker collapsed, whereas "ilé kan náà wó" (house a the collapse) means a house the speaker may not be familiar with collapsed.

Transforming a tree-oriented model into a language model for Yorùbá would require changing every rule where these two constituents are involved, as well as creating a grammar symbol that covers sequences of determiners rather than just a single one.

In a constraint-based model, in contrast, the needed transformation can be expressed in terms of separate constraints: we need to replace the English precedence constraint with its converse (in a noun phrase, a noun must precede a determiner)¹, and delete the constraint that in a noun phrase, determiners must be unique. These modifications carry over to the entire grammar without further ado.

¹ In fact a subtler algorithm is needed; this will be covered later.

This intuition leads directly to that of our WG transformation methodology: first we implement an executable version (i.e., a parser) of our constraint-based grammar model of English; then, instead of running English phrases by it, we feed it a Yorùbá input phrase. Given that we have a Yorùbá lexicon, this parses into a pre-lexical string (e.g., "noun, adjective, determiner"). Obviously, many of the English constraints (in particular, for example, the English precedence constraints) will fail to hold for it. Examining such failures with respect to a representative set of input phrases in Yorùbá, our system will be able to determine whether for Yorùbá we must, e.g., induce the converse of an English precedence constraint (if the converse ordering is found in every single Yorùbá phrase), simply delete the English constraint (if both orderings are allowed), or make it conditional on some linguistic feature being present (if in some cases one ordering routinely held, whereas in other cases it didn't), etc.
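To make the property-based view of Sections 3 and 4 concrete, here is a small illustrative sketch in Python (ours, not the paper's CHRG implementation; all names and the grammar fragment are our own simplifications). It checks precedence and uniqueness properties over a pre-lexical string and returns the satisfied and violated lists that such frameworks report:

```python
# Illustrative sketch (not the paper's CHRG code): checking PG-style
# precedence and uniqueness properties over one pre-lexical string.

ENGLISH_NP = {
    "precedence": [("determiner", "noun"), ("adjective", "noun")],
    "uniqueness": ["determiner"],
}

def check_properties(categories, grammar):
    """Return (satisfied, violated) property lists for one phrase."""
    satisfied, violated = [], []
    for before, after in grammar["precedence"]:
        if before in categories and after in categories:
            prop = ("precedence", before, after)
            if categories.index(before) < categories.index(after):
                satisfied.append(prop)
            else:
                violated.append(prop)
    for cat in grammar["uniqueness"]:
        prop = ("uniqueness", cat)
        if categories.count(cat) > 1:
            violated.append(prop)
        elif cat in categories:
            satisfied.append(prop)
    return satisfied, violated

# "ìwé pupa kan" parses into the pre-lexical string noun, adjective,
# determiner; both English precedence constraints fail on it.
sat, vio = check_properties(["noun", "adjective", "determiner"], ENGLISH_NP)
```

Rather than rejecting the phrase outright, the sketch, like the frameworks described above, keeps both lists, which is exactly the output the induction step later inspects.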
5 Our Methodology

5.1 Constraint Modification: The WG Paradigm

Just as its ancestor, Property Grammar, the WG paradigm describes a language's phrases in terms of constraints between pairs of direct daughters of a phrasal category. The classic constraints are constituency (which constituents are allowable as immediate daughters of a phrasal category), precedence (in a given phrase, a certain daughter must precede a certain other daughter), obligation (a certain daughter must occur in a given phrase), uniqueness (a daughter can occur only once in a given phrase), requirement (a certain daughter is required in a given phrase), dependency (the same features must be shared between two daughters of a given phrase), and exclusion (if a certain daughter occurs in a given phrase, another certain daughter may not co-occur).

WG extends the parsing capabilities implicit in these properties into a model of grammatical induction, in addition to parsing. In their first incarnation [Dahl and Miralles, 2012] they allowed greater efficiency by focusing on failed constraints only, rather than explicitly calculating all successful and unsuccessful constraints. However, in adapting the model to Yorùbá, we have found it more efficient to explicitly calculate satisfied constraints as well, as we go along. This is partly because, in order to arrive at as nuanced a description as needed for Yorùbá, we had to extend the classical Property Grammar model to accommodate what we call conditional constraints: those whose satisfaction depends on the run-time values of combinations of features (more on this later).

The general WG model can be described as follows. Let LS be the source language; its syntactic component will be noted L^S_syntax. Likewise, we call the target language LT, its lexicon L^T_lex, and its syntax L^T_syntax. If we can get hold of a sufficiently representative set of phrases in LT that are known to be correct (a set where our desired subset of language is represented), we can feed these to a hybrid parser consisting of L^S_syntax and L^T_lex. This will result in some of the sentences being marked as incorrect by the parser. An analysis of the constraints these "incorrect" sentences violate can subsequently reveal how to transform L^S_syntax so it accepts as correct the sentences in the corpus of LT, that is, how to transform it into L^T_syntax by modifying the constraints that were violated into constraints that accept the input. Figures 1 and 2 respectively show the problem and our solution in schematic form.

Figure 1: The Problem. Figure 2: The Solution. (Schematic diagrams: the WG parser takes L^T_corpus, L^T_lex and L^S_syntax as input; the violated syntax properties it outputs feed a model transformation module that produces L^T_syntax.)

An Example

Let LS = English and LT = Yorùbá, and let us assume that English determiners always precede the noun they modify, while in Yorùbá they always post-cede it (an oversimplification, just for illustration purposes). Thus "a red book" is correct English, whereas in Yorùbá we would more readily say "ìwé pupa kan" (book, red, a).

If we plug the Yorùbá lexicon and the English syntax constraints into our WG parser, and run a representative corpus of (correct) Yorùbá noun phrases by the resulting hybrid parser, the said precedence property will be declared unsatisfied when hitting phrases such as "ìwé pupa kan". The model transformation module can then look at the entire list of unsatisfied constraints, and produce the missing syntactic component of LT's parser by modifying the constraints in L^S_syntax so that none are violated by the corpus sentences.

Some of the necessary modifications are easy to identify and to perform; e.g., for accepting "ìwé pupa kan" we only need to delete the (English) precedence requirement of determiner over noun (noted det < n). However, subtler modifications may be in order after some statistical analysis in a second round of parsing: if in our LT corpus, which we have assumed representative, all determiners appear after the noun they modify, Yorùbá is sure to include the reverse precedence property as in English: n < det. So in this case, not only do we need to delete det < n, but we also need to add n < det.

5.2 Constraint Solving as the Implementation Means

Constraint Satisfaction has yielded powerful results in many AI areas, including human language processing. To a large extent, this powerful problem-solving paradigm has allowed us to overcome the disconnect that prevailed until relatively recently between languages for problem description and those for problem solving. Since the advent of CHR [Frühwirth, 1998] and of its grammatical counterpart CHRG [Christiansen, 2005], constraint-based linguistic formalisms can materialize through fairly direct methodologies.
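The hybrid-parser loop described in Section 5.1 can be sketched as follows. This is a simplified Python re-rendering under our own assumptions (pre-lexical category strings, precedence constraints only, at most one occurrence of each category per phrase), not the actual CHRG system:

```python
# Hypothetical sketch of the WG loop of Section 5.1: run the source
# precedence constraints over a target-language corpus and keep, flip,
# or drop each one. Data structures and names are ours, not the paper's.

def transform(source_precedence, corpus):
    """source_precedence: set of (before, after) category pairs.
    corpus: pre-lexical category strings of correct target phrases."""
    induced = set()
    for before, after in source_precedence:
        relevant = [p for p in corpus if before in p and after in p]
        if not relevant:
            continue
        holds = [p.index(before) < p.index(after) for p in relevant]
        if all(holds):
            induced.add((before, after))   # constraint carries over
        elif not any(holds):
            induced.add((after, before))   # induce the converse
        # mixed evidence: drop it here (the full model would instead
        # try to make it conditional on features, Section 5.3)
    return induced

yoruba_corpus = [["noun", "adjective", "determiner"],
                 ["noun", "determiner"]]
new_prec = transform({("determiner", "noun"), ("adjective", "noun")},
                     yoruba_corpus)
# det < n fails in every relevant phrase, so n < det is induced instead
```

The three branches mirror the three outcomes discussed at the end of Section 4: keep the constraint, induce its converse, or neither.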
In a nutshell, CHRG rules rewrite constraints into other constraints, subject to possible checks described in a rule's guard and stated as Prolog calls. Their general format is:

α -\ β /- γ ::> G | δ.

This rule is called a propagation rule. The part of the rule preceding the arrow ::> is called the head, G the guard, and δ the body; α, β, γ, δ are sequences of grammar symbols and constraints, so that β contains at least one grammar symbol, and δ contains exactly one grammar symbol which is a nonterminal (and perhaps constraints); α (γ) is called the left (right) context and β the core of the head; G is a conjunction of built-in constraints as in Constraint Handling Rules (CHR), and no variable in G can occur in δ. If the left or right context is empty, the corresponding marker is left out, and if G is empty (interpreted as true), the vertical bar is left out. The convention from Definite Clause Grammars (DCG) is adopted that constraints (i.e., non-grammatical material) in the head and body of a rule are enclosed by curly brackets. Gaps and parallel matches are not allowed in rule bodies. A gap in the rule heads is noted "...". Gaps are used to establish references between two long-distant elements.

A simplification (grammar) rule is similar to a propagation rule except that the arrow is replaced by <:>. A simpagation (grammar) rule is similar to a simplification rule; however, one or more grammar symbols are prefixed with !, which means that those grammar symbols should not be removed from the constraint store. Details of the CHRG encoding, how to download the CHRG system, and how to use the encoding can be found in the CHRG manual [Christiansen, 2010].

Implementing the WG paradigm through constraint solving allows us both to express and to test linguistic constraints in very modular fashion, and results in succinct yet executable code.
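To make the simpagation semantics just described concrete, here is a toy Python simulation (ours; the real CHRG engine is far more general and operates on positioned grammar symbols). Head constraints passed as "keep" play the role of !-prefixed symbols, the remaining heads are removed from the store, and the body is added:

```python
# Toy simulation of simpagation semantics: heads marked as kept (the !
# prefix) stay in the store, unmarked heads are removed and the body is
# added. This is our illustration only, not the CHRG engine itself.

def apply_simpagation(store, heads_keep, heads_remove, body):
    """Fire the rule once if all head constraints are in the store."""
    if all(h in store for h in heads_keep + heads_remove):
        for h in heads_remove:
            store.remove(h)
        store.extend(body)
        return True
    return False

# A store holding two words and a precedence property: the words
# (! prefixed in CHRG) survive, the violated property is replaced by a
# fail/1-style update constraint.
store = ["word(adjective)", "word(noun)", "g(precedence(noun,adjective))"]
fired = apply_simpagation(
    store,
    heads_keep=["word(adjective)", "word(noun)"],
    heads_remove=["g(precedence(noun,adjective))"],
    body=["fail(precedence(noun,adjective))"])
```

This mirrors how a violated grammar property can be weeded out of the store while the word symbols that triggered the rule remain available for other rules.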
In our CHRG implementation the appropriate WG constraints are entered in terms of a constraint g/1, whose argument stores each possible grammar property. For instance, our English grammar hybrid parser for noun phrases includes the constraints g(obligatority(noun, pronoun, proper noun)), g(constituency(determiner)), g(precedence(determiner,adjective)), g(unicity(determiner)), g(requirement(noun, determiner)), g(dependence(determiner,noun)), and so on. These properties are weeded out upon detection of a violation by CHRG rules that look for them; e.g., an input noun phrase where an adjective precedes a noun may provoke deletion of the constraint g(precedence(noun,adjective)) when the following CHRG rule applies:

!word(C2,F1,_),...,!word(C1,F2,_),
{g(precedence(C1,C2))}<:>
{fail(precedence(C1,C2), F1, F2)}.

This simpagation rule is triggered when a word of category C2 comes before one of category C1, given the existence of the grammar constraint that C1 must precede C2.² Each of the properties dealt with has similar rules associated with it.

² Recall that in CHRG syntax the symbols prefixed with exclamation points are kept, while the ones without are replaced by the body of the rule, in this case an update constraint that invokes some housekeeping procedures.

Each word is stored in a CHRG symbol word/3, along with its category and features (e.g., word(noun,[neutral, n/a, common, inanimate],aso)); the CHRG parse predicate stores and abstracts the position of each word in the sentence.

5.3 Main Modifications to the Model Resulting from our Application to Yorùbá

Conditional Properties

The original constraint-based model, as we have seen, succeeded in detecting and signalling mistakes in the sentence without blocking the analysis. In this first model, which is parsing-oriented, incorrect sentences could be "accepted" through declaring some constraints as relaxable. For instance, while from the context-free grammar rules shown in Section 3 we wouldn't be able to parse "the the book" (a common mistake from cutting and pasting in word processors), in the constraint-based formulation we can, if we relax the uniqueness-of-determiner constraint.

Relaxation can be made conditional (e.g., a head noun's requirement for a determiner can be made relaxable in case the head noun is generic and in plural form, as in "Lions sleep tonight"). The failure of relaxable constraints is signalled in the output, but does not block the entire sentence's analysis. Implementations not including constraint relaxation capabilities implicitly consider all properties as relaxable. And in fact, when exploiting constraint-based parsing for grammar transformation rather than for parsing, this is exactly what we need, since in an unknown language any constraint may need to be "relaxed" and even, of course, corrected.

For that reason, we have considered it more appropriate, in the context of grammar induction, to state "exceptions" as part of a property itself, rather than separately in a relaxation definition. Thus we have created conditional properties, e.g., "precedence(pronoun,noun)", which occurs if a pronoun precedes a noun, is plural, is a personal pronoun, and can be used in place of both animate and inanimate nouns.

We have also used conditional properties to express choice: whereas before we could state that one element was obligatory in a phrase, we can now state that a property needs to have one of a list of constituents. E.g., the g(obligatority(noun, pronoun, proper noun)) constraint only needs to find one of noun, pronoun or proper noun to succeed. If none of these is found, then a failure occurs, which indicates that an obligatory head is absent.

From Failure Driven to Success-and-Failure Driven

In view of unprecedented efficiency where property grammar related models were concerned, the original WG model calculated only failed properties, on the assumption that these could be trusted to be a complement of the successful properties, thus obviating the need to calculate both explicitly. However, our more in-depth analysis in the context of Yorùbá has, as we have seen, uncovered the need for more nuances than simply failing or succeeding, as in the case of conditional properties. We therefore now use a success-and-failure-driven approach for inducing the grammar of our target language, Yorùbá. Each input phrase of the target language is tested with all relevant constraints for both failure and success. This makes the model slightly less efficient than if we were to calculate failed properties only, but of course the gain is in accuracy. Efficiency is still guaranteed by the normal CHRG way of operating: rules will only trigger when relevant; e.g., if a phrase is comprised of only a noun and an adjective, it will not be tested with, for instance, precedence(pronoun, determiner) or any other constraint whose categories are different from those of the input phrase. We keep a list of all properties that fail and another for those that succeed, together with the features of the categories of each input phrase and their counts. It is important to state that constituency constraints are tested only for success. This is because we are interested in checking that our target grammar shares similar constituents with our source language, and testing for failure would be irrelevant for these constraints.

Model Transformation Module

After the first round of parsing, we do a statistical analysis in a second round of parsing, where we determine which constraints should be induced as a property or a conditional property of our target language and which need to be weeded out. First we determine which constraints should be added to the grammar by finding all constraints that succeed in all relevant phrases. We are correct in doing so since our input phrases are correct and representative of the target language. We also add the converse of any precedence rule that fails in all relevant phrases, so that, for instance, if precedence(pronoun, determiner) fails in all relevant phrases, precedence(determiner, pronoun) is induced as a property of our target language. Next, we decide which constraints should be conditional properties. All constraints that both fail and succeed are tested in this phase. We find these properties by searching for features which are unique to the failure and/or success of these constraints. Any constraint where such features are found is added to the grammar as a conditional property, with its conditions (i.e., features) clearly stated. All other constraints which are not induced by the above rules are not added to the grammar of Yorùbá.

Inducing Constraints not Present in the Original Language

We also added an additional function that finds all precedence constraints that are absent in our source but present in our target language. We are equally able to test these constraints as above, by checking whether such a constraint succeeds in all relevant phrases and whether its converse succeeds in any input phrase. If we find that the converse does not succeed in any of the input phrases, this constraint is induced as a property of the target language; else we search for a unique feature as above.

Using Tonality to Induce Conditional Properties

Yorùbá is a tone language, therefore it was imperative to ensure that tones are properly represented. We adopt a full tone marking approach, so that each word is marked with its tones in order to avoid ambiguities. The tones are also stored as features, so that for instance the word tútù (cold), which has two syllables with a high and low tone on each syllable respectively, is stored as "high, mid" in the features of the word. For instance:

[tútù]::> word(adjective, [neutral, n/a, descriptive, both], tútù).

These tones are used to infer conditional properties in the second phase of parsing, so that if a property is found to occur around words with certain tones, this property will be conditioned on the presence of such tones. It is important to note that the tones are not used in isolation of other features, and a property will be said to be conditioned on tones only if tones are the only unique features present. The addition of tones makes our approach extensible to many other languages, including tone languages.

In addition to the aforementioned, the system provides a record of the English translation of every word in the input phrases, and this record is consulted during parsing to produce the English equivalent for each word in the input phrase.
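The second-round analysis of the Model Transformation Module can be sketched as follows. This is a Python paraphrase under our own simplifications (feature handling in the actual system is richer, and the names below are illustrative):

```python
# Sketch of the induction decision for one constraint: succeed
# everywhere -> plain property; mixed outcomes -> conditional property
# if some feature value occurs only in the successful phrases;
# otherwise the constraint is not induced. Our paraphrase, not the
# paper's CHRG code.

def induce(observations):
    """observations: (succeeded, features) pairs for one constraint,
    where features is a frozenset of feature values of the phrase."""
    if all(ok for ok, _ in observations):
        return ("property", None)
    success, failure = set(), set()
    for ok, feats in observations:
        (success if ok else failure).update(feats)
    unique = success - failure
    if unique:
        return ("conditional", unique)
    return ("rejected", None)

obs = [(True, frozenset({"plural", "personal"})),
       (True, frozenset({"plural", "personal"})),
       (False, frozenset({"singular"}))]
kind, cond = induce(obs)
# the constraint holds only for plural personal forms -> conditional
```

Tones, being stored as features, participate in exactly the same unique-feature test, which is how tonal conditions on properties arise.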
5.4 Sample Output

Here is a subset of our induced Yorùbá grammar:

-precedence(pronoun,noun):-
pronoun precedes noun if the pronoun is plural, personal and can be used in place of both animate and inanimate nouns

-precedence(adjective,noun):-
adjective precedes noun if the adjective is plural, quantity-uncountable and can be used with both animate and inanimate nouns

-obligatority((noun;proper-noun;pronoun)):-
noun;proper-noun;pronoun are obligatory. It succeeds in all 46 input phrases.
noun succeeds in 37 relevant phrases;
pronoun succeeds in 5 relevant phrases;
proper-noun succeeds in 4 relevant phrases;

-precedence(pronoun,adjective):-
pronoun precedes adjective in all 4 relevant phrases

-precedence(quantifier,adjective):-
quantifier precedes adjective in all 4 relevant phrases

-precedence(quantifier,pronoun):-
quantifier precedes pronoun in all 4 relevant phrases

-precedence(pronoun,determiner):-
pronoun precedes determiner in all 7 relevant phrases

-precedence(proper-noun,adjective):-
proper-noun precedes adjective in all 3 relevant phrases

-precedence(proper-noun,determiner):-
proper-noun precedes determiner in all 3 relevant phrases.

-pronoun is a constituent of noun phrases. It occurs in 12 phrases.

-adjective is a constituent of noun phrases. It occurs in 26 phrases.

The first two properties are conditional properties, and the features of these properties are explicitly stated. The obligatority constraint, which represents the head of phrases, consists of nouns, proper-nouns and pronouns. We provide further information on the number of times each of these categories occurs as the head of the phrase. We do not output the features for properties other than the conditional properties, because they already succeed in all relevant phrases.

6 How can our model be acquired?

Our model for grammar induction is, in a sense, already acquired by many other possible source-target language pairs (those that are amenable to description through the linguistic system of constraints that we accept), because rather than developing a language-dependent implementation of our model, we have implemented a generator of grammar inducers which, to work with a different pair of languages than the one we've chosen, requires only: a) a representative sample of target language phrases; b) that the source grammar be expressed in terms of the linguistic constraints accepted in our model (namely, obligatority, precedence, constituency, requirement, dependency, uniqueness and exclusion); and c) that the target grammar's lexicon be described in our user-friendly provided plain-text format, for the subset of words used in the input set of target language phrases.

7 Verification and Validation

Verifying and validating that our results are correct comprises two tasks:

• Checking that the input and the output syntax descriptions are correct. Since these descriptions are done in terms of the linguistic Property Grammar formalism, it may be challenging for users unfamiliar with it to be sure that it covers exactly the subset of language addressed. However, there is a mechanical procedure to transform context-free grammars into property grammars [Blache and Rauzy, 2012], so if the user is more familiar with context-free grammars, and the subset of language allows it, we can at least start from a known, correct Context-Free (CF) version and apply the said procedure, either by hand or mechanically, in order to arrive at an equivalent formulation in terms of our linguistic constraints. Alternatively, we can consult a linguist, if one is available who specializes in the involved languages.

• Checking that the set of input phrases from the target language is correct and representative. This requires at the very least a native speaker, and if possible also a linguist, given that some users' intuitions about what is correct in their language (and indeed, about even whether they use it "correctly" or not) may not be accurate.

8 Concluding Remarks

We have presented an implemented model for integrating grammatical knowledge representation and reasoning around the CHRG constraint paradigm of problem solving. Our model extends the original grammar induction formalism, WG, in ways motivated by the needs of our particular application: the automatic transformation of an English model of syntax into that of Yorùbá.

We have also presented a proof-of-concept system that allows users to input a source language's grammar description in modular and declarative fashion, in terms of the linguistic constraints that are supported (constituency, precedence, dependency, obligatority, requirement, exclusion, unicity). Since such descriptions also stand alone as a purely linguistic model of a language's syntax, our system can cater even to users who are not conversant with coding, such as pure linguists wanting to perform syntactic model transformation experiments aided by computers. Their purely (constraint-based) linguistic descriptions of syntax automatically turn into executable code when appended to our own code.

As we have seen, our system automatically transforms a user's syntactic description of a source language into that of a target language, of which only the lexicon and a set of representative sample phrases are known. While demonstrated specifically for English as source language and Yorùbá as target language, our implementation can accept any other pair of languages for inducing the syntactic constraints of one from those of the other, as long as their description can be done in terms of the supported constraints.

We have thus contributed a flexible modelling framework solidly grounded in linguistic knowledge representation which, through the magic of constraint solving, turns into an efficiently executable problem solving tool. Through its inherent modularity, our model can support revision and adaptation quite straightforwardly: new constraints between immediate daughters of any phrase can be modularly added and implemented, without touching the rest of the system.

Also, our model can accommodate solution revisions at execution time by interacting with a linguist whenever constraints cannot be induced consistently. Visual representations of the output that would be helpful for linguists in this respect have been proposed in [Adebara and Dahl, 2015] and are yet to be implemented.

Other than integrating knowledge-based and efficiency-based paradigms of problem solving, our approach demonstrates that constraint programming is not just about solving classic OR problems. It is also a powerful tool for addressing interdisciplinary applications [Dahl, Miralles, and Becerra, 2012; Becerra, Dahl, and Miralles, 2013; Becerra, Dahl, and Jiménez-López, 2014; Adebara, Dahl, and Tessaris, 2015; Adebara and Dahl, 2015].

Further interesting ramifications of our work would include: testing our system for a larger coverage of syntax (we have only addressed noun phrases so far); testing it for other tonal-sensitive languages; studying how assimilation influences language structure, especially in tone languages; and studying whether any further constraints or approach extensions would be needed to accommodate families of languages not easily describable within our approach (e.g., languages which have different categories from the source language, those which have different inflectional paradigms from the source language, and those that exhibit constraints not among our allowable set of constraints).

Acknowledgments

This research was supported by V. Dahl's NSERC Discovery grant 31611024.

References

Adebara, I., and Dahl, V. 2015. Domes as a Prodigal Shape in Synthesis-Enhanced Parsers. In SHAPES 3.0, Constraints and Logic Programming (CSLP'04).

Adebara, I., and Dahl, V. 2015. Shape Analysis as an Aid for Grammar Induction. In SHAPES 3.0, Constraints and Logic Programming (CSLP'04).

Adebara, I.; Dahl, V.; and Tessaris, S. 2015. Completing Mixed Language Grammars through Womb Grammars plus Ontologies. In Christiansen, H.; Jiménez-López, M. D.; and Loukanova, R., eds., Proceedings of the International Workshop on Partiality, Underspecification and Natural Language Processing, 32–40.

Becerra, L.; Dahl, V.; and Miralles, E. 2013. On Second Language Tutoring through Womb Grammars. 12th International Work-Conference on Artificial Neural Networks (IWANN) 2013, June 12-14, Tenerife, Spain.

Becerra, L.; Dahl, V.; and Jiménez-López, M. D. 2014. Womb Grammars as a Bio-Inspired Model for Grammar Induction. In Trends in Practical Applications of Heterogeneous Multi-Agent Systems. The PAAMS Collection. Springer International Publishing. 79–86.

Blache, P. 2005. Property Grammars: A Fully Constraint-Based Theory. In Proceedings of the First International Conference on Constraint Solving and Language Processing, CSLP'04, 1–16. Berlin, Heidelberg: Springer-Verlag.

Blache, P., and Rauzy, S. 2012. Hybridization and Treebank Enrichment with Constraint-Based Representations. Proceedings of the International Conference on Language Resources and Evaluation.

Cohen, S. B., and Smith, N. A. 2010. Covariance in Unsupervised Learning of Probabilistic Grammars. Journal of Machine Learning Research 11:3017–3051.

Dahl, V., and Blache, P. 2004. Directly Executable Constraint Based Grammars. In Proc. Journées Francophones de Programmation en Logique avec Contraintes.

Dahl, V., and Miralles, J. E. 2012. Womb Grammars: Constraint Solving for Grammar Induction. In Sneyers, J., and Frühwirth, T., eds., Proceedings of the 9th Workshop on Constraint Handling Rules, Technical Report CW 624, 32–40.

Dahl, V.; Miralles, E.; and Becerra, L. 2012. On Language Acquisition through Womb Grammars. In 7th International Workshop on Constraint Solving and Language Processing, 99–105.

Duchier, D.; Dao, T.-B.-H.; and Parmentier, Y. 2013. Model-Theory and Implementation of Property Grammars with Features. Journal of Logic and Computation 19.

Fagborun, O. 1994. The Yorùbá Koiné: Its History and Linguistic Innovations. München: LINCOM EUROPA.

Frühwirth, T. W. 1998. Theory and Practice of Constraint Handling Rules. Journal of Logic Programming 37(1-3):95–138.

Headden, W. P.; Johnson, M.; and McClosky, D. 2009. Improving Unsupervised Dependency Parsing with Richer Contexts and Smoothing. In North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL), 101–109.

Klein, D., and Manning, C. D. 2004. Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency. In Association for Computational Linguistics (ACL), 478–485.

Nicolas, L.; Molinero, M. A.; Sagot, B.; Sánchez Trigo, E.; de La Clergerie, É.; Farré, J.; and Vergés, J. M. 2009. Towards Efficient Production of Linguistic Resources: The Victoria Project.

Pereira, F. C. N., and Schabes, Y. 1992. Inside-Outside Reestimation from Partially Bracketed Corpora. In Association for Computational Linguistics (ACL), 128–135.

Wang, M.; Smith, N. A.; and Mitamura, T. 2007. What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA.
Conference on Empirical Methods in Natural sources and Evaluation (LREC). Language Processing and Computational Natural Lan- Burkett D. and Klein D. 2008. Two Languages are Better guage Learning (EMNLP-CoNLL). 22–32. than One (for Syntactic Parsing). In Empirical Methods for Natural Language Processing (EMNLP). 877–886. Charniak E. and Johnson M. 2005. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. In As- sociation for Computational Linguistics (ACL). Christiansen, H. 2005. CHR grammars. Theory and Practice of Logic Programming (TPLP) 5(4-5):467–501. Christiansen, H. 2010. CHRG manual. http://akira.ruc.dk/ henning/chrg/CHRGusers- Guide.html.
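As a closing illustration of the constraint vocabulary used throughout (obligatority, precedence, uniqueness, requirement, exclusion), the satisfied/violated bookkeeping that drives the induction can be sketched in a few lines. The system itself is constraint-solving based (CHR-style, per the Womb Grammars literature cited above); the toy Python checker below is purely our illustrative assumption, not the authors' implementation, and all constraint and category names in it are hypothetical:

```python
# Hedged sketch: a toy checker for property-style constraints over a noun
# phrase given as a flat list of lexical categories. All names here are
# illustrative assumptions, not the Womb Grammars system's actual code.

def check_np(categories):
    """Return the (satisfied, violated) constraint-name sets for one phrase."""
    satisfied, violated = set(), set()

    def record(name, ok):
        (satisfied if ok else violated).add(name)

    # Obligatority: a noun phrase must contain a noun.
    record("obligatority(noun)", "noun" in categories)
    # Uniqueness: at most one determiner.
    record("uniqueness(det)", categories.count("det") <= 1)
    # Precedence: every determiner precedes every noun.
    det_pos = [i for i, c in enumerate(categories) if c == "det"]
    noun_pos = [i for i, c in enumerate(categories) if c == "noun"]
    record("precedence(det, noun)",
           all(d < n for d in det_pos for n in noun_pos))
    # Requirement: an adjective requires a noun in the same phrase.
    record("requirement(adj, noun)",
           "adj" not in categories or "noun" in categories)
    # Exclusion (hypothetical): a pronoun excludes a determiner.
    record("exclusion(pro, det)",
           not ("pro" in categories and "det" in categories))

    return satisfied, violated

# A noun-before-determiner phrase (Yorùbá-like ordering) violates the
# English-style precedence constraint:
ok, bad = check_np(["noun", "det"])
print("precedence(det, noun)" in bad)  # → True
```

The output of such a checker, i.e. which source-language constraints fail on the target-language sample, is precisely the kind of evidence the induction process uses to drop or adapt a constraint when transforming the source grammar into the target grammar.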