<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Grammar Assistance Using Syntactic Structures (GAUSS)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Olga Zamaraeva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorena S. Allegue</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlos Gómez-Rodríguez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Margarita Alonso-Ramos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasiia Ogneva</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidade da Coruña, CITIC, Department of Computer Science and Information Technologies.</institution>
          <addr-line>15071 A Coruña</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidade da Coruña, CITIC, Department of Humanities (“Letras”).</institution>
          <addr-line>15071 A Coruña</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universidade de Santiago de Compostela, Department of Developmental Psychology</institution>
          ,
          <addr-line>15782 Santiago de Compostela</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Automatic grammar coaching serves an important purpose: advising on standard grammar varieties while not imposing social pressures or reinforcing established social roles. Such systems already exist, but most of them are for English and few of them offer meaningful feedback. Furthermore, they typically rely completely on neural methods and require huge computational resources which most of the world cannot afford. We propose a grammar coaching system for Spanish that relies on (i) a rich linguistic formalism capable of giving informative feedback; and (ii) a faster parsing algorithm which makes using this formalism practical in a real-world application. The approach is feasible for any language for which there is a computerized grammar and is less reliant on expensive and environmentally costly neural methods. We seek to contribute to Greener AI and to address global education challenges by raising the standards of inclusivity and engagement in grammar coaching.</p>
      </abstract>
      <kwd-group>
        <kwd>grammar engineering</kwd>
        <kwd>grammar coaching</kwd>
        <kwd>second language acquisition</kwd>
        <kwd>HPSG</kwd>
        <kwd>syntactic theory</kwd>
        <kwd>syntax</kwd>
        <kwd>parsing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>[Figure residue: HPSG attribute-value matrices of type adj-fem-pl with PERNUM and GEN agreement features.]</p>
    </sec>
    <sec id="sec-2">
      <title>2. State of the art at the start of the project</title>
      <p>Most grammar coaching systems available today are
purely statistical and do not use explicit linguistic
knowledge. Based on purely statistical methods and lacking
interpretability, they “guess” based on the context and
are not aware of concepts like agreement. Their feedback
is divorced from the methodology of suggesting a better
sentence, opening possibilities for wrong feedback. Such
systems are often only available for English, because their
neural architectures require huge quantities of training
data. Such systems are also ecologically problematic [<xref ref-type="bibr" rid="ref1">1</xref>].</p>
      <p>The HPSG theory covers many syntactic phenomena
and has been developed and tested using a variety of data
from a variety of languages. One of the approaches to
the empirical testing of this theory is implementing it on
the computer and then automatically parsing data and
inspecting the results for correctness and consistency.
Efforts of this kind include ParGram [6], CoreGram [7]
and DELPH-IN [8, 9]. It is this approach that gave rise to
the SRG.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The GAUSS project is the result of the collaboration between research areas such as computer science, NLP, theoretical linguistics, and applied linguistics. The intersectional nature of the project is realized by the combination of NLP techniques and theoretically formalized grammars. In particular, the project relies on the Spanish Resource Grammar [SRG; 2, 3, 4], a grammar of Spanish implemented in the Head-driven Phrase Structure Grammar formalism (HPSG).</p>
      <sec id="sec-3-1">
        <title>3.1. HPSG syntax theory</title>
        <p>Head-driven Phrase Structure Grammar [HPSG; 5] is a constraint unification theory of syntax. A sentence is analyzed as a structure whose parts can be constrained to be identical to each other. For example, a verb’s agreement values (e.g. third person) can be constrained to be identical to the agreement values of the verb’s subject. Similarly, adjectives can be constrained with respect to the agreement values of the noun they modify, as shown in Figure 2. Crucially, ungrammatical strings of words will violate the constraints required for well-formed structures and as such will not be covered by an HPSG grammar.</p>
        <p>Structures like the ones in Figure 2 are instances of more general types and can be seen in the specific results of deploying the grammar on some data. The grammar itself contains the types, not the instances. The types are instantiated through interfacing with the lexicon and, in some cases, an external morphophonological analyzer.</p>
      </sec>
      <sec id="sec-3-1-2">
        <title>3.2. DELPH-IN Consortium</title>
        <p>The DELPH-IN research consortium is an international effort for grammar engineering using HPSG: the Deep Linguistic Processing with HPSG Initiative. It is committed to using a particular version of the HPSG formalism that was defined originally in [8]. The consortium develops tools such as parsers, including the parser we used in this project, the ACE parser [10]. Another set of relevant tools includes the software for automatic profiling of test data known as [incr tsdb()] (pronounced ‘tsdb++’) [11, 12] and a related tool, the “full-forest treebanker” (ftb) [13]. These tools allow us to inspect differences between grammar versions systematically.</p>
        <p>Grammars are tested on sentences automatically, using a parser. The first time a grammar is run on a sentence, an expert must verify the correctness of the output. Often it makes sense to do this by looking at the semantic (dependency) structure; we can assume that if the semantics is correct, then the syntactic structure that corresponds to it is adequate. The semantics in DELPH-IN grammars is modeled with the Minimal Recursion Semantics formalism [MRS; 14]. An MRS structure is a bag of predications encoding dependencies as well as modifier and negation scope, information structure, and more. It can be automatically converted to a dependency structure familiar to natural language processing (NLP) practitioners (Figure 3). When the parser analyzes a sentence according to the grammar, the resulting structure includes an MRS, the adequacy of which is easy to establish manually (whether the meaning of the sentence is the intended one). The adequacy of the obtained analyses on corpora serves as accumulating evidence for the validity of the theory of syntax.</p>
      </sec>
      <sec id="sec-3-1-1">
        <title>3.3. Spanish Resource Grammar</title>
        <p>(1) *Mis abuelos son personas famosos.
my.3pl grandparent.masc.pl be.3pl.pres.ind person.fem.3pl famous.masc.pl
Intended: ‘My grandparents are famous people.’ [spa; Yamada et al. 18]</p>
      </sec>
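      <p>The agreement mechanism described in Section 3.1 can be illustrated with a minimal sketch. This is a toy illustration, not SRG or ACE code: the real grammar uses typed feature structures, while here a feature structure is a flat dictionary. The failed unification corresponds to the gender clash in example (1).</p>

```python
# Toy sketch of unification-based agreement checking (not DELPH-IN code).
# A feature structure is modeled as a flat dict of feature-value pairs.

def unify(fs1, fs2):
    """Unify two feature structures; return None if any feature conflicts."""
    result = dict(fs1)
    for feat, val in fs2.items():
        if feat in result and result[feat] != val:
            return None  # conflicting values: unification fails
        result[feat] = val
    return result

# PNG (person-number-gender) values, as in the structures of Figure 2.
noun_personas = {"PERNUM": "3pl", "GEN": "fem"}   # 'personas'
adj_famosos = {"PERNUM": "3pl", "GEN": "masc"}    # 'famosos'
adj_famosas = {"PERNUM": "3pl", "GEN": "fem"}     # 'famosas'

# A noun-modifier structure is well-formed only if the PNG values unify.
assert unify(noun_personas, adj_famosos) is None  # *personas famosos
assert unify(noun_personas, adj_famosas) == {"PERNUM": "3pl", "GEN": "fem"}
```

      <p>Because unification fails for *personas famosos, no well-formed structure covers the string, which is how an HPSG grammar rejects ungrammatical input.</p>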
      <sec id="sec-3-2">
        <p>At the core of the project’s methodology is the digital representation of Spanish syntax, the Spanish Resource Grammar [2, 3, 4]. The SRG consists of 54,510 lemmas in the lexicon, 543 lexical types to instantiate those lemmas, 504 lexical rule types serving morphophonological analysis, and 226 phrasal types. It is the second largest DELPH-IN grammar (after the English Resource Grammar [15, 16]). The SRG was first developed prior to the ACE parser, and one of the objectives of the GAUSS project ended up being the complete reimplementation of the SRG morphophonological interface. The outcome is that the SRG can now be used with the ACE parser [4]. As before, it relies on an external morphophonological analyzer, FreeLing [17].</p>
        <p>One major outcome of this is that we could reparse the portions of the AnCora corpus previously released as the TIBIDABO treebank [3]. The previously released version was partially verified for the correctness of the structure, but the accuracy figures corresponding to that verification were never reported (as far as we can tell). One of the outcomes of GAUSS is the re-parsed, re-verified, and re-released portions of TIBIDABO (currently 2291 sentences) [4]. The updated version of the SRG along with the verified treebanks are open-source and are released on GitHub: https://github.com/delph-in/srg</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Using the SRG with learner data</title>
        <p>The main idea behind the GAUSS project is that we can use the SRG to model constructions characteristic of learners of Spanish (as opposed to native speakers). We create a version of the SRG that is modified specifically to cover learner constructions, starting with gender agreement constructions, like the one illustrated in example (1).</p>
        <p>The grammar will detect such learner structures using what is called ‘mal-rules’ [19], a technical term for HPSG types designed specifically to cover productions characteristic of learners. For example, the grammar has to have a way to ignore the incompatible agreement values in Figure 4. We achieve this with only a small set of modifications to the grammar. We use the interface of the grammar with the external morphophonological analyzer to recognize any noun or adjective as potentially belonging to either gender (this requires 40 short additional entries in the lexical rule section of the grammar, one corresponding to each possible FreeLing noun or adjective tag). We associate each such lexical rule with a special LEARNER feature, so that ultimately any sentence that uses one or more of such rules can be detected as a learner production. No changes in the syntax part of the grammar are required, in principle. However, deploying the grammar on the learner sentences without modifications revealed a number of overgeneration issues in the original grammar, which we were able to fix thanks to this experiment. Overgeneration is when a grammar covers an ungrammatical sentence or produces a nonsensical structure for a sentence along with the correct one(s). When we saw instances of the original grammar covering learner productions, we investigated such cases and found 4 syntactic types (so far) which were underconstrained with respect to the agreement values. We added the missing agreement constraints, which resulted in reduced overgeneration and ambiguity of the SRG with respect to the TIBIDABO treebank. In this way, modeling learner constructions helped us improve the analysis of agreement in the original SRG.</p>
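        <p>The detection logic just described can be sketched as follows. The rule names and the derivation format here are hypothetical; in the SRG, the mal-rules are lexical rules tied to FreeLing tags and the LEARNER feature lives in the grammar’s type system.</p>

```python
# Sketch of learner-production detection via mal-rules carrying a
# LEARNER feature. Rule names and the flat derivation format are
# illustrative, not the SRG's actual identifiers.

LEARNER_RULES = {
    "noun-either-gender-mal-rule",  # lets a noun carry either GEN value
    "adj-either-gender-mal-rule",   # same for adjectives
}

def is_learner_production(derivation):
    """A parse (list of rule names used in its derivation) is a learner
    production iff at least one rule carrying the LEARNER feature fired."""
    return any(rule in LEARNER_RULES for rule in derivation)

# *personas famosos parses only via an either-gender mal-rule.
assert is_learner_production(
    ["np-head-mod", "adj-either-gender-mal-rule", "noun-lex-rule"])
# A native-like parse uses no mal-rules.
assert not is_learner_production(["np-head-mod", "noun-lex-rule"])
```
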
        <p>After all the necessary mal-rules are implemented, the plan is to (1) accompany each model of a learner construction with meaningful feedback; and (2) deploy the grammar as a web-based service such that it can be tested by learners of Spanish. This is work in progress.</p>
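        <p>Step (1) amounts to mapping each fired mal-rule to an explanation. A minimal sketch of that mapping follows; the rule name and the feedback message are illustrative, not the project’s actual ones.</p>

```python
# Sketch of attaching meaningful feedback to mal-rules (planned work).
# The rule name and message below are hypothetical examples.

FEEDBACK = {
    "adj-either-gender-mal-rule":
        "The adjective does not agree in gender with the noun it modifies "
        "(e.g. 'personas famosos' should be 'personas famosas').",
}

def feedback_for(derivation):
    """Collect a feedback message for every mal-rule fired in the parse."""
    return [FEEDBACK[rule] for rule in derivation if rule in FEEDBACK]

msgs = feedback_for(["np-head-mod", "adj-either-gender-mal-rule"])
assert len(msgs) == 1 and "agree in gender" in msgs[0]
```
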
        <sec id="sec-3-2-1">
          <title>3.5. Parsing speed bottleneck</title>
          <p>The main challenge in HPSG parsing speed is that large feature structures combinatorially lead to a huge search space. As a result, HPSG parsing is comparatively slow in practice. For example, the ACE parser takes about 3 seconds per sentence on average on a corpus of 100K sentences (some of these sentences take minutes while others take less than a second) [20]. The GAUSS project attempts to address this challenge by a combination of methodologies: (1) improving analyses in the grammar to reduce meaningless ambiguity (overgeneration) and thus reduce the size of the parse chart; (2) integrating top-down parsing; and (3) filtering lexical entries and grammar rules so that fewer rules are considered at each step. Method (1) is what we employed while addressing the overgeneration we discovered by deploying the grammar on the learner corpus. We have managed to improve the SRG’s performance by up to 60% on sentences of length 8-10. Method (2) has been underexplored in HPSG but has seen rekindled interest recently [21]. HPSG parsers are overwhelmingly bottom-up, but for long sentences a lot can be learned immediately from the start of the sentence (the top of the syntax tree), discarding many irrelevant search paths. Method (3) includes developing a neural supertagger (filter) for HPSG. The supertagger reduces the number of possibilities the parser needs to explore by discarding unlikely word meanings. Statistical filtering was successfully applied to HPSG [22], and we are now researching how neural methods can improve the state of the art. We started by applying method (3) to the English Resource Grammar treebanks and obtained a speed-up of a factor of three compared to the baseline. However, when we attempted the method on the Spanish treebanks, the results were not yet satisfactory, apparently because the Spanish treebanks were not big enough at the start of the GAUSS project. Now that we have added more verified items in the treebanks, we can attempt to train a neural supertagger for Spanish once again.</p>
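          <p>Method (3) can be sketched as a simple filter over lexical candidates. This is a toy illustration under stated assumptions: the tagger is a stub probability table, whereas in GAUSS it is a neural model, and the lexical type names are invented for the example.</p>

```python
# Toy sketch of supertag filtering (method 3): keep only lexical entries
# whose type the tagger considers likely, shrinking the parser's search
# space. The probability table stands in for a neural supertagger.

def filter_lexicon(token, candidate_types, tag_probs, threshold=0.01):
    """Discard lexical types the supertagger deems unlikely for a token."""
    kept = [t for t in candidate_types
            if tag_probs.get((token, t), 0.0) >= threshold]
    # Never filter everything out: fall back to all candidates so the
    # parser cannot lose coverage, only speed.
    return kept or candidate_types

# Hypothetical lexical types for Spanish 'son' ('are'):
probs = {("son", "verb-ser-lex"): 0.95, ("son", "noun-lex"): 0.001}
assert filter_lexicon("son", ["verb-ser-lex", "noun-lex"], probs) == ["verb-ser-lex"]
```

          <p>The fallback branch reflects the design constraint that a filter may only prune the search space, never make a parseable sentence unparseable.</p>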
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>4. Planning and Team</title>
        <p>The GAUSS project consists of three Research Objectives (RO) and four Work Packages (WP). They are summarized in Table 1.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>The GAUSS project is funded by the European Union’s Horizon Europe Framework Programme under the Marie Skłodowska-Curie postdoctoral fellowship grant HORIZON-MSCA-2021-PF-01 (GAUSS, grant agreement No 101063104). The project is carried out in the Language and Society Information research group (LyS) of Universidade da Coruña.</p>
    </sec>
    <sec id="sec-5">
      <title>References</title>
      <p>[2] M. Marimon, The Spanish Resource Grammar, in: LREC, 2010.</p>
      <p>[3] M. Marimon, N. Bel, L. Padró, Automatic selection of HPSG-parsed sentences for treebank construction, Computational Linguistics 40 (2014) 523–531.</p>
      <p>[4] O. Zamaraeva, L. S. Allegue, C. Gómez-Rodríguez, Spanish Resource Grammar version 2023, in: COLING-2024, in press.</p>
      <p>[5] C. Pollard, I. Sag, Head-Driven Phrase Structure Grammar, CSLI, 1994.</p>
      <p>[6] M. Butt, T. H. King, Urdu and the parallel grammar project, in: Proceedings of the 3rd workshop on Asian language resources and international standardization-Volume 12, Association for Computational Linguistics, 2002, pp. 1–3.</p>
      <p>[7] S. Müller, The CoreGram project: Theoretical linguistics, theory development and verification, Journal of Language Modelling 3 (2015) 21–86.</p>
      <p>[8] A. Copestake, Appendix: Definitions of typed feature structures, Natural Language Engineering 6 (2000) 109–112.</p>
      <p>[9] E. M. Bender, G. Emerson, Computational linguistics and grammar engineering, in: S. Müller, A. Abeillé, R. D. Borsley, J.-P. Koenig (Eds.), Head-Driven Phrase Structure Grammar: The handbook, 2021.</p>
      <p>[10] B. Crysmann, W. Packard, Towards efficient HPSG generation for German, a non-configurational language, in: COLING, 2012, pp. 695–710.</p>
      <p>[11] S. Oepen, D. Flickinger, Towards systematic grammar profiling. Test suite technology 10 years after, Computer Speech &amp; Language 12 (1998) 411–435.</p>
      <p>[12] S. Oepen, [incr tsdb()] competence and performance laboratory. User and reference manual, 1999.</p>
      <p>[13] W. Packard, UW-MRS: Leveraging a deep grammar for robotic spatial commands, SemEval 2014 (2014) 812.</p>
      <p>[14] A. Copestake, D. Flickinger, C. Pollard, I. A. Sag, Minimal recursion semantics: An introduction, Research on Language and Computation 3 (2005) 281–332.</p>
      <p>[15] D. Flickinger, On building a more efficient grammar by exploiting types, Natural Language Engineering 6 (2000) 15–28.</p>
      <p>[16] D. Flickinger, Accuracy v. robustness in grammar engineering, in: E. M. Bender, J. E. Arnold (Eds.), Language from a Cognitive Perspective: Grammar, Usage and Processing, CSLI, Stanford, CA, 2011, pp. 31–50.</p>
      <p>[17] X. Carreras, I. Chao, L. Padró, M. Padró, FreeLing: An open-source suite of language analyzers, in: Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04), 2004.</p>
      <p>[18] A. Yamada, S. Davidson, P. Fernández-Mira, A. Carando, K. Sagae, C. Sánchez-Gutiérrez, COWS-L2H: A corpus of Spanish learner writing, Research in Corpus Linguistics 8 (2020) 17–32.</p>
      <p>[19] D. Schneider, K. McCoy, Recognizing syntactic errors in the writing of second language learners, in: ACL, 1998, pp. 1198–1204.</p>
      <p>[20] O. Zamaraeva, C. Gómez-Rodríguez, Revisiting supertagging for HPSG, 2023. arXiv:2309.07590.</p>
      <p>[21] L. Chiruzzo, D. Wonsever, Statistical deep parsing for Spanish using neural networks, in: IWPT, 2020, pp. 132–144.</p>
      <p>[22] R. Dridan, Ubertagging: Joint segmentation and supertagging for English, in: EMNLP, 2013, pp. 1201–1212.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dodge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Etzioni</surname>
          </string-name>
          , Green AI, Communications of the ACM
          <volume>63</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>