<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Interaction Grammar for English Verbs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shohreh Tabatabayi Seifi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INRIA / Université de Lorraine</institution>
          ,
          <addr-line>BP 239 54506 Vandoeuvre-les-Nancy cedex</addr-line>
        </aff>
      </contrib-group>
      <fpage>160</fpage>
      <lpage>169</lpage>
      <abstract>
        <p>This paper describes the construction of a grammar for English verbs using Interaction Grammars. Interaction Grammar is a grammatical formalism based on two key notions: polarities and constraint systems. A polarity expresses an available resource or a lack of a resource and is used to discriminate between saturated and unsaturated syntactic structures. A grammar is viewed as a system of constraints of different kinds (structural, feature and polarity constraints) which must be satisfied by the parse trees of sentences. We have developed a grammar for English verbs in affirmative clauses and evaluated it on a portion of a test suite of sentences, the English TSNLP, with LEOPAR, a parser devoted to Interaction Grammars.</p>
      </abstract>
      <kwd-group>
        <kwd>Grammatical formalism</kwd>
        <kwd>interaction grammar</kwd>
        <kwd>tree description</kwd>
        <kwd>polarity</kwd>
        <kwd>unification grammar</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Interaction Grammars</title>
      <p>In order to get a view of how Interaction Grammars (IG) work, the following notions are
required.</p>
      <sec id="sec-2-1">
        <title>Tree Description</title>
        <p>
          A tree description is a tree-like structure which uses underspecified
relations to describe a family of trees instead of a single tree [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This makes it possible to derive
an unlimited number of trees from a single underspecified tree representation. A tree
which has no underspecified relation is a model. For instance, one tree
description with an underspecified dominance link, illustrated on the left side of Fig. 1,
can produce three (or more) different models.
        </p>
        <p>IG associates features with the nodes of tree descriptions to impose constraints and prevent
the building of ungrammatical parse trees. Feature structures model the internal structure of
the language, where grammatical properties are represented. Not only do feature
structures prevent the building of ungrammatical sentences, they also carry valuable linguistic
information. Using feature structures to construct a grammar is a salient characteristic
of a formalism like LFG, but unlike LFG there are no recursive features in IG: feature
structures have only one level.</p>
        <p>Polarities are one of the basic notions in Interaction Grammars. A polarized feature is
used to control the process of merging the nodes of trees to be combined. Different kinds
of polarities are used in IG. Features with positive or negative polarities
denote an available resource to be consumed or an expected resource to be provided,
respectively. A neutral feature is a feature which is not to be consumed but
simply participates in ordinary unification when its node is combined with other nodes. A
saturated feature is the result of combining a positive and a negative feature and
can no longer be unified with any positive or negative feature.
        </p>
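        <p>As a rough illustration of the underspecified dominance relation introduced at the start of this subsection, the following Python sketch checks whether a candidate tree is compatible with a simplified tree description. It is not taken from IG or from any IG tool: the function names, the child-to-parent encoding and the node labels are all assumptions made for the example, and features, polarities and precedence constraints are ignored.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def ancestors_or_self(tree, node):
    """Yield node, then each of its ancestors, in a child -> parent map."""
    while node is not None:
        yield node
        node = tree.get(node)


def satisfies(tree, immediate, dominance):
    """Check a tree against a simplified tree description.

    tree      -- dict mapping each node to its parent (the root maps to None)
    immediate -- set of (parent, child) pairs: fully specified links
    dominance -- set of (upper, lower) pairs: underspecified links, satisfied
                 when upper is lower itself or any ancestor of lower
    """
    for parent, child in immediate:
        if tree.get(child) != parent:
            return False
    for upper, lower in dominance:
        if upper not in ancestors_or_self(tree, lower):
            return False
    return True


# Hypothetical description: S is the parent of NP, and S dominates V
# through a path whose length is left open.
immediate = {("S", "NP")}
dominance = {("S", "V")}

# Three candidate trees that differ only in how deep V sits below S.
depth_1 = {"S": None, "NP": "S", "V": "S"}
depth_2 = {"S": None, "NP": "S", "VP": "S", "V": "VP"}
depth_3 = {"S": None, "NP": "S", "VP": "S", "VP2": "VP", "V": "VP2"}

# A tree in which V is not below S at all.
not_a_model = {"X": None, "S": "X", "NP": "S", "V": "X"}
```
]]></preformat>
        <p>Under these assumptions the first three trees are all accepted by the same description, which is the sense in which one underspecified link stands for a family of trees; the last tree is rejected because V lies outside the subtree of S.</p>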
        <p>Finally, there are virtual features, which must be merged during
the parsing process with real features, that is, positive, negative, neutral or saturated features. Virtual
polarities are used to express required contexts. Positive, negative and virtual features
constitute the active polarities because they need to combine with other polarities.
In Fig. 2, features with positive (–&gt;), negative (&lt;–) and virtual (∼) polarities can be
seen in the tree descriptions associated with the words <italic>been</italic> and <italic>arranged</italic>. Saturated
features, indicated with the symbol &lt;==&gt;, and neutral features, indicated with the symbol = or
==, can be seen in the non-completed parse tree on the left side of Fig. 2. The horizontal
arrows between nodes indicate linear precedence constraints and come in two forms:
large precedence (- - &gt;), which means there can be several nodes between the two
nodes, and immediate precedence (–&gt;), which prevents any other node from being located between
them.</p>
        <p>Tree descriptions along with polarized feature structures are called polarized tree
descriptions or PTDs. The composition process is defined as a sequence of PTD
superpositions controlled by polarities. When two nodes merge, their features are unified
according to the standard process of unification, with the extra restrictions coming from
the rules of combination for polarities. Saturated trees are completely specified
trees which no longer have any active polarity. The goal of parsing is to generate all
saturated trees from the set of input PTDs, which come from a particular IG grammar.</p>
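        <p>The way polarities restrict node merging can be sketched in a few lines of Python. This is only an approximation written for illustration: the text states that a positive and a negative feature combine into a saturated one and that a saturated feature accepts no further positive or negative feature, whereas the remaining cases below (a neutral or virtual feature taking on the polarity of its partner, and two saturated features being incompatible) are assumptions of this sketch. The ASCII spellings of the polarities and the function names are likewise hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
POS, NEG, NEUTRAL, VIRTUAL, SATURATED = "->", "<-", "=", "~", "<==>"

# Polarities that still need to combine with something else.
ACTIVE = {POS, NEG, VIRTUAL}


def combine(p, q):
    """Return the polarity obtained by merging p and q, or None if forbidden."""
    if {p, q} == {POS, NEG}:
        return SATURATED  # stated in the text: a resource meets a need
    if p == VIRTUAL:
        return q          # assumption: virtual adopts its partner's polarity
    if q == VIRTUAL:
        return p
    if p == NEUTRAL:
        return q          # assumption: neutral leaves its partner unchanged
    if q == NEUTRAL:
        return p
    # Remaining pairs: positive with positive, negative with negative, and
    # saturated with positive, negative or (by assumption) saturated.
    return None


def is_saturated(polarities):
    """A structure is saturated when no active polarity remains."""
    return not any(p in ACTIVE for p in polarities)
```
]]></preformat>
        <p>With this table, a parse succeeds only if repeated merging leaves a structure for which <monospace>is_saturated</monospace> holds, which mirrors the requirement that the final trees carry no active polarity.</p>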
      </sec>
      <sec id="sec-2-2">
        <title>Grammar</title>
        <p>An IG grammar is defined by a finite set of PTDs, which are called the elementary
PTDs or EPTDs of the grammar. Each EPTD has one or more leaves which are linked
with a word of the lexicon; such a leaf is called an anchor node. This means that the grammar
is strictly lexicalized and there is no EPTD without an anchor. In practice there is
more than one EPTD for each lexical entry in the grammar, reflecting its different
syntactic properties.</p>
        <p>The parser of an IG grammar has two main roles: first, to select the EPTDs of the
words in the input sentence from the set of EPTDs in the grammar, and then to build
all valid minimal models from these EPTDs.</p>
        <p>Fig. 2 and Fig. 3 show an example of building a model for the sentence <italic>The interview
has been arranged.</italic> The first figure is a snapshot of the intermediate results during
the parsing process: a non-completed parse tree with unsaturated nodes, along with the
EPTDs of the remaining words of the sentence. The second figure shows the result
yielded by combining the EPTDs of the remaining words with the non-completed tree:
it is one valid model, the parse tree of the input sentence.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Building grammars with wide coverage</title>
      <p>
        Tackling real language problems requires a large-scale lexicalized grammar, which
is hard to build; without automatic tools it is an overwhelming task. The
main reason is the high degree of grammatical and lexical redundancy. For
instance, in IG several PTDs may share the same subtrees, which is due to the grammatical
redundancy of syntactic structures. Moreover, different lexical entries share the same
elementary PTDs because they belong to the same syntactic category. These
redundancies turn even a small modification of the grammar into a change affecting a large
number of grammar trees. To overcome these obstacles, eXtensible MetaGrammar
(XMG) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is used as a facilitating tool to write the IG grammar. XMG is a tool for
constructing large-scale grammars. Its main feature is the distinction between
source grammar and object grammar. A source grammar is a collection of statements
written in a human-readable language; it is compiled into an object grammar, which is
usable by NLP systems.
      </p>
      <sec id="sec-3-1">
        <title>Source grammar and object grammar</title>
        <p>
          The terms source grammar and object grammar are analogous to source
code and object code in programming languages. In the current task we first wrote the
source grammar and then compiled it with XMG into the object grammar, which is
encoded in XML. XMG is widely used in the construction of grammars in two formalisms:
IG and TAG [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>In the source grammar, each small fragment of tree structure is written in an individual
class. By use of class disjunction, conjunction and inheritance, more complex classes can
be built. The compilation of the terminal classes of the source grammar produces the
EPTDs of the object grammar. An EPTD does not contain any lexical entry yet, but
has a detailed description of the word which is going to be placed in its anchor node.</p>
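        <p>The combinatorial effect of class conjunction and disjunction can be illustrated with a small Python sketch. It is not XMG syntax and does not reproduce the XMG compiler: here a class is simply a list of alternatives, each alternative being a set of fragment names, and the fragment names themselves are hypothetical. Inheritance and the actual solving of tree descriptions are left out.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import product


def fragment(name):
    """A basic class: a single alternative made of one tree fragment."""
    return [frozenset([name])]


def disj(*classes):
    """Disjunction: any alternative of any operand is an alternative."""
    return [alt for cls in classes for alt in cls]


def conj(*classes):
    """Conjunction: pick one alternative per operand and accumulate them."""
    return [frozenset().union(*combo) for combo in product(*classes)]


# Hypothetical fragments; the names are illustrative only.
subject = disj(fragment("canonical_subject"), fragment("empty_subject"))
voice = disj(fragment("active"), fragment("passive"))

# A terminal class built from the smaller ones.
verb = conj(fragment("verb_head"), subject, voice)
```
]]></preformat>
        <p>In this toy setting the terminal class <monospace>verb</monospace> expands into four alternatives from five fragment definitions, which gives an idea of why factoring a grammar into small classes keeps the source compact while the set of compiled structures grows.</p>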
      </sec>
      <sec id="sec-3-2">
        <title>Linking the grammar with the lexicon</title>
        <p>The characteristics of the words that may be placed in an anchor node of an
EPTD are described in the EPTD interface. An interface is a feature structure describing
the words which are able to anchor the EPTD. Every entry in the lexicon is a pair of a
word and a feature structure; hence a word can anchor an EPTD if its feature structure
is unifiable with the EPTD’s interface, using exactly the same unification as
in node merging.</p>
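        <p>The anchoring test described above can be sketched as unification of flat, one-level feature structures. The following Python fragment is an illustration under stated assumptions rather than the actual implementation: feature values are modelled as sets of admissible atoms, and the feature names, values and lexical entries are invented for the example.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def unify(fs1, fs2):
    """Unify two flat feature structures.

    Each structure maps a feature name to a set of admissible atomic values.
    Returns the unified structure, or None when a shared feature has no
    value in common.
    """
    result = dict(fs1)
    for feature, values in fs2.items():
        if feature in result:
            common = result[feature] & values
            if not common:
                return None
            result[feature] = common
        else:
            result[feature] = values
    return result


def can_anchor(lexical_entry, interface):
    """A word can anchor an EPTD if its features unify with the interface."""
    _word, features = lexical_entry
    return unify(features, interface) is not None


# Hypothetical interface of an EPTD for a past participle.
interface = {"cat": {"v"}, "mood": {"pastp"}}

# Hypothetical lexicon entries: (word, feature structure).
arranged = ("arranged", {"cat": {"v"}, "mood": {"pastp", "ind"}})
interview = ("interview", {"cat": {"n"}})
```
]]></preformat>
        <p>Here <monospace>arranged</monospace> is accepted because both of its features share a value with the interface, while <monospace>interview</monospace> is rejected on the category feature alone.</p>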
        <p>The whole process chain is illustrated in Fig. 4 and the highlighted box is the part
that has been developed in our work.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>The Grammar of English Verbs</title>
      <p>
        To build a complete Interaction Grammar for a specific language, able to
parse any possible sentence, we need a set of EPTDs for a wide range of words from
different categories such as verb, auxiliary, noun, adjective, adverb, preposition, determiner,
pronoun, etc. Our main effort was on the construction of a grammar for verbs in
affirmative clauses (covering, e.g., different tenses, moods and voices). We have used a small grammar
for noun phrases, pronouns, prepositions and adjectives to provide the other EPTDs needed
to parse a sentence [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>The central concern in writing a grammar in the IG formalism is to write
classes in such a way that, with the use of inheritance, conjunction and disjunction, all the
different grammar trees needed in a complete grammar can be obtained.</p>
      <sec id="sec-4-1">
        <title>A Sketch of the Major Modules</title>
        <p>Five major modules contribute to building the EPTDs for verbs. Module VerbalKernel
is the kernel of all lexical and auxiliary verbs. It implements the different possible voices and
moods of verbs and also provides the syntactic functions of a verb phrase (e.g. head,
subject, modifier, etc.). A subclass in a module is an intermediate tree description
that we use along with operators such as inheritance, conjunction and disjunction to build
the final tree descriptions.</p>
        <p>There are seventeen subclasses in module VerbalKernel. For instance, we model
the long-distance dependency between a verb and its subject with the use of an
underspecified relation. All tree descriptions of verbs produced by this
module have a subject node, which is an empty node for imperative verbs and a
non-empty node for all other verbs.</p>
        <p>Fig. 5. A schema of relations between some of the classes in the module VerbalKernel.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Evaluation Results</title>
      <p>
        The English TSNLP (Test Suite for Natural Language Processing) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] project
aimed to provide a tool for natural language evaluation purposes. It is a database of
phrases which exhibit different linguistic phenomena, to be used as a reference for
evaluation tasks. These phrases are divided into several categories according to the
linguistic phenomena they illustrate. Each category has both grammatical and ungrammatical
examples, which makes it possible not only to determine the true positives (success
in parsing grammatical sentences) but also to count the true negatives (failure
to parse ungrammatical sentences).
      </p>
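      <p>The counting scheme just described can be written down as a short Python sketch. It is given only to fix the terminology: the function name, the toy parser and the four toy sentences are hypothetical, they are not drawn from TSNLP, and the resulting counts are not results of this work.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def evaluate(parses, grammatical, ungrammatical):
    """Count parser outcomes over a two-part test suite.

    parses        -- callable returning True when a sentence gets a parse
    grammatical   -- sentences that should be parsed
    ungrammatical -- sentences that should be rejected
    """
    tp = sum(1 for s in grammatical if parses(s))
    tn = sum(1 for s in ungrammatical if not parses(s))
    return {
        "true_positives": tp,
        "false_negatives": len(grammatical) - tp,
        "true_negatives": tn,
        "false_positives": len(ungrammatical) - tn,
    }


# Toy stand-in for a parser: it accepts exactly the strings listed here.
ACCEPTED = {"the dog barks", "dog the barks"}


def toy_parser(sentence):
    return sentence in ACCEPTED


# Toy test items, invented for this example.
counts = evaluate(
    toy_parser,
    grammatical=["the dog barks", "the dogs bark"],
    ungrammatical=["dog the barks", "the dog bark"],
)
```
]]></preformat>
      <p>A false negative in this scheme is a grammatical sentence for which no parse is found, and a false positive is an ungrammatical sentence that is nevertheless parsed.</p>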
      <p>
        The English TSNLP has 15 different categories, 3 of which contain phenomena
exclusively related to verbs, the subject of this research. However, those three
categories also contain some structures which are not covered by the current
grammar, and sentences of those forms were left out as well, including phrasal verbs,
sentences having <italic>there</italic> as their subject and sentences containing relative clauses (e.g.
That she leave is requested by him.). We have used the LEOPAR parser [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] to construct
the models for the input sentences, and the result of the evaluation of the current grammar
can be seen in Table 1.
      </p>
      <p>The major reason for failure in parsing grammatical sentences is that the construction
of verbs up to now requires that every verb tree have a subject node, which must be
filled with a real subject while parsing the sentence. However, there are situations in
which a verb acts like a noun or an adjective and there is no such subject node
in its grammar tree (e.g. He finds the office closed.). Because we were
focusing on the construction of the verbs of the main VP of the sentence, other grammatical
phenomena related to verbs were not properly treated, and this failure should
not be regarded as a weakness of the framework. In due time all these structures
can be added to the grammar, which is one of the goals of future work.
      </p>
      <p>The other kind of failure is to mistakenly parse some ungrammatical sentences. One of
the main reasons, among others, is that some sentences are
incorrect because of semantic issues, which our grammar cannot detect.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>Interaction Grammar is a powerful formalism that combines the advantages of both unification
grammars and categorial grammars. Writing a wide-coverage grammar
for the English language requires a huge effort, and this work can be regarded as the
starting point of such a project. Using tools like XMG to accomplish this goal is quite
helpful and makes writing and then tuning the grammar much easier
than before. Moreover, a high degree of factorization is possible when we separate
source grammar and object grammar, which leads to more efficiency.</p>
      <p>The potential future aims of this project are, first, to continue constructing a
complete grammar for English incorporating all the different phenomena, in order to parse any
grammatical English sentence, and second, to address still open problems such as
coordination within the same framework.</p>
      <p>On the other hand, the English TSNLP is a relatively simple set of sentences and is
not very similar to real corpora. A further step would therefore be to enrich the
grammar so that it can cope with real corpora, such as newspaper or
spoken language corpora, where the structure of sentences is more complicated and
the grammar should also be able to handle a number of not quite grammatical sentences.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Perrier</surname>
          </string-name>
          , G.:
          <article-title>Interaction grammars</article-title>
          .
          <source>In 18th International Conference on Computational Linguistics</source>
          ,
          <source>CoLing</source>
          <year>2000</year>
          , Saarbrücken, pages
          <fpage>600</fpage>
          -
          <lpage>606</lpage>
          . (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Crabbé</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Grammatical development with XMG</article-title>
          .
          <source>LACL 05</source>
          . (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Retoré</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>The Logic of Categorial Grammars</article-title>
          .
          <source>ESSLLI</source>
          <year>2000</year>
          , Birmingham. (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Steedman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Categorial Grammars</article-title>
          .
          <comment>A short encyclopedia entry for MIT Encyclopedia of Cognitive Science</comment>
          . (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bresnan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Lexical-Functional Syntax</article-title>
          .
          <source>ESSLLI</source>
          <year>2000</year>
          , Birmingham. (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Duchier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le Roux</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmentier</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>The Metagrammar Compiler: an NLP Application with a Multi-paradigm Architecture</article-title>
          . In
          <source>Second International Mozart/Oz Conference, MOZ</source>
          <year>2004</year>
          , Charleroi, Belgium, pages
          <fpage>175</fpage>
          -
          <lpage>187</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Marcus</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hindle</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fleck</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          :
          <article-title>D-Theory: Talking about Talking about Trees</article-title>
          .
          <source>In 21st Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>129</fpage>
          -
          <lpage>136</lpage>
          .(
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Pullum</surname>
            ,
            <given-names>G.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scholz</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          :
          <article-title>On the Distinction between Model-Theoretic and Generative-Enumerative Syntactic Frameworks</article-title>
          .
          <source>In Logical Aspects of Computational Linguistics</source>
          , LACL
          <year>2001</year>
          , Le Croisic, France, volume
          <volume>2099</volume>
          <source>of Lecture Notes in Computer Science</source>
          , pages
          <fpage>17</fpage>
          -
          <lpage>43</lpage>
          . Springer Verlag. (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Guillaume</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrier</surname>
          </string-name>
          , G.:
          <article-title>Interaction Grammars</article-title>
          .
          <source>INRIA Research Report</source>
          <volume>6621</volume>
          : http://hal.inria.fr/inria-00288376/ (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Guillaume</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le Roux</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marchand</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrier</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fort</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Planul</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A Toolchain for Grammarians</article-title>
          .
          <source>CoLING 08</source>
          , Manchester. (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Planul</surname>
          </string-name>
          , J.:
          <article-title>Construction d'une Grammaire d'Interaction pour l'anglais</article-title>
          . Mémoire, présenté et soutenu publiquement le 27 juin
          . (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oepen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Regnier-Prost</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Netter</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lux</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Falkedal</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fouvry</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Estival</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dauphin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Compagnion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baur</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balkan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arnold</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>TSNLP - Test Suite for Natural Language Processing</article-title>
          .
          <source>In Proceedings of COLING</source>
          <year>1996</year>
          , Copenhagen
          . (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>