<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PCFG Extraction and Pre-typed Sentence Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Noémie-Fleur Sandillon-Rezer nfsr@labri.fr</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LaBRI, CNRS, 351 cours de la Libération</institution>
          ,
          <addr-line>33405 Talence</addr-line>
        </aff>
      </contrib-group>
      <fpage>150</fpage>
      <lpage>159</lpage>
      <abstract>
        <p>We explain how we extracted a PCFG (probabilistic context-free grammar) from the Paris VII treebank. First, we transform the syntactic trees of the corpus into derivation trees. The transformation is done with a generalized tree transducer, a variation of the usual top-down tree transducers, and yields derivation trees for an AB grammar, which is a subset of a Lambek grammar containing only the left and right elimination rules. We then extract a PCFG from the derivation trees, assuming that they are representative of the grammar. The extracted grammar is used for sentence analysis, through a slightly modified CYK algorithm that takes the probabilities into account. It enables us to know whether a sentence is included in the language described by the grammar.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This article describes a method to extract a PCFG from the Paris VII treebank [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The
first step is the transformation of the syntactic trees of the treebank into derivation
trees representative of an AB grammar [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], which corresponds to the elimination rules
of Lambek Calculus, as shown in Fig. 1. We chose an AB grammar because we want our
approach to be potentially compatible with some usual learning algorithms, like the
one of Buszkowski and Penn [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] or Kanazawa [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Once we have the derivation trees,
we extract the PCFG from them and use it for sentence analysis. This analysis helps
us see how to improve our grammar and the whole processing chain used to obtain
it, by analyzing why some correct sentences cannot be parsed or why some incorrect
ones are still parsed. In the longer term, parsing French sentences can be used
for grammar checking and, with semantic information added to a lexicon, the grammar
could be used to generate coherent sentences.
      </p>
      <p>
        The Paris VII treebank [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] contains sentences from the newspaper Le Monde,
analyzed and annotated by the Laboratoire de linguistique de Paris VII. The flat shape of
its trees does not allow the direct application of a usual learning algorithm, so we decided
to use a generalized tree transducer. For our work, we use a subpart of the treebank,
in parenthesized form, composed of 12,351 sentences. Although the whole treebank
is available in XML, the parenthesized form is easier to process with the transducer.
The 504 sentences left aside form an evaluation treebank that we use as a control
group.
      </p>
      <p>
        Another new treebank, Sequoia [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which is composed of 3,200 sentences coming
from various sources, will also be used for experimentation. It is annotated using the
same conventions as the French Treebank.
      </p>
      <p>
        This article first gives an overview of the transducer we use to transform syntactic trees
into derivation trees, then focuses on PCFG extraction. In a third part, we
detail the experimental results, obtained by using our PCFG to find the best analysis
for a sentence, via the CYK algorithm [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
    </sec>
    <sec>
      <title>Generalized Tree Transducer</title>
      <p>
There are many ways to perform a syntactic analysis of a treebank, as we can see with the
work of Hockenmaier [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], or Klein and Manning [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], but they were not applicable to
the French Treebank or they did not give a simple AB grammar.
      </p>
      <p>
        The transducer we created is the central point of the grammar extraction process.
Indeed, the binarization of the syntactic trees parametrizes the extracted grammar. We
based our work on the usual derivation rules of AB grammars used in computational
linguistics and on the annotations of the treebank itself [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The annotations give two types
of information about the trees:
      </p>
      <p>POS-tag: the part-of-speech tags, reserved for the pre-terminal nodes, indicate the
part of speech of the daughter node. For example, NC is used for a common noun,
DET for a determiner, etc.</p>
      <p>Phrasal types: the nodes that are neither terminal nor pre-terminal are
annotated with their syntactic category and sometimes with the role of the node. An NP-SUJ
node corresponds to a noun phrase used as the subject of the sentence.</p>
      <p>The usual derivation rules are instances of the elimination rules of the Lambek
calculus (see Fig. 1). Following other annotation methods, an NP node
has the type np, a sentence has the type s, a prepositional phrase taken as an argument
has the type pp, and so on.</p>
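      <p>As an illustration of the two elimination rules, the following Python sketch models AB types and their combination. The names Atom, Slash, Backslash and combine are hypothetical and do not come from the tools described in this article; the lexical types are those given in Fig. 2 to the words of "On ouvre sur la Cinq". We assume the convention used in the figures, where b\a expects an argument b on its left and yields a.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic type such as np, s or pp."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Slash:
    """a/b: yields `result` when an `arg` is found on its right."""
    result: "Type"
    arg: "Type"

    def __str__(self) -> str:
        return f"({self.result}/{self.arg})"


@dataclass(frozen=True)
class Backslash:
    """b\\a: yields `result` when an `arg` is found on its left."""
    arg: "Type"
    result: "Type"

    def __str__(self) -> str:
        return f"({self.arg}\\{self.result})"


Type = Union[Atom, Slash, Backslash]


def combine(left: Type, right: Type) -> Optional[Type]:
    """Try right elimination (a/b, b gives a), then left elimination
    (b, b\\a gives a). Return None when neither rule applies."""
    if isinstance(left, Slash) and left.arg == right:
        return left.result
    if isinstance(right, Backslash) and right.arg == left:
        return right.result
    return None


np, s, pp = Atom("np"), Atom("s"), Atom("pp")
sur = Slash(pp, np)                   # sur: pp/np
ouvre = Slash(Backslash(np, s), pp)   # ouvre: (np\s)/pp

pp_phrase = combine(sur, np)          # "sur" + "la Cinq"
vp = combine(ouvre, pp_phrase)        # "ouvre" + "sur la Cinq"
sentence = combine(np, vp)            # "On" + "ouvre sur la Cinq"
```
]]></preformat>
      <p>In this sketch, the last call mirrors the left elimination step np, np\s ⊢ s, while the first two are right eliminations.</p>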
      <p>
        A transducer, as defined in TATA [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], is an automaton that reads an input and
writes an output. It can be applied to trees and transform their shape.
The transducers have some features that are especially important for our work: they are
non-erasing (which ensures that no information is lost during the transduction),
linear (the transduction does not change the order of the words in the sentence) and
ε-free (which gives more control over the transduction). The G transducer, developed by
Sandillon-Rezer in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], has additional features that fit better with the global shape
of the treebank:
      </p>
      <p>Recursivity: our transducer can apply a rule to a node by looking only at its label,
whatever its arity. This generalizes the usual definition of a transduction rule by
matching nodes with an arbitrary number of daughters. Studying each specific
case of the rule can transform the recursive rule into a set of ordinary rules.</p>
      <p>Parametrization: we allow the rules to have generic nodes, each of which stands for any
node from a finite set of nodes. This quantification is equivalent to writing out each
instantiation of the generic node.</p>
      <p>Priority system: as our transducer needs to be deterministic, we decided to always apply
the rules in a given order. This ensures that there is only one output tree.</p>
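      <p>The interplay between arbitrary arity and rule priority can be sketched as follows. This is only a toy model under our own assumptions: the two rules (rule_pair and rule_fold_right) and the tree encoding are hypothetical, they are not the rules of the G transducer, and the resulting binarization is not the one produced on the treebank. The sketch only shows that trying rules in a fixed order gives a single output, and that the rewriting is non-erasing and linear.</p>
      <preformat><![CDATA[
```python
# A tree is either a word (str) or a pair (label, list of daughters).

def rule_pair(label, children):
    """Hypothetical rule: a node with exactly two daughters is kept as is."""
    if len(children) == 2:
        return children
    return None


def rule_fold_right(label, children):
    """Hypothetical recursive rule: matches any arity above two and groups
    all daughters but the first under a new node with the same label."""
    if len(children) > 2:
        return [children[0], (label, children[1:])]
    return None


# Priority order: the first rule that matches is the only one applied.
RULES = [rule_pair, rule_fold_right]


def transduce(tree):
    """Rewrite a tree top-down, applying at most one rule per node."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    for rule in RULES:
        rewritten = rule(label, children)
        if rewritten is not None:
            children = rewritten
            break
    return (label, [transduce(child) for child in children])


def leaves(tree):
    """Words of the tree, from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


flat = ("SENT", [("CLS", ["on"]), ("V", ["glisse"]),
                 ("ADV", ["assez"]), ("ADV", ["vite"])])
binary = transduce(flat)
```
]]></preformat>
      <p>Because the new node created by rule_fold_right is itself visited by transduce, a single rule covers every arity, which is the idea behind the recursive rules described above.</p>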
      <p>The transduction rules were written from a systematic analysis of the different
shapes found in the treebank. As an example, a syntactic tree from the treebank
and its transduction are shown in Fig. 2.</p>
      <p>[Fig. 2: syntactic tree of the sentence "On ouvre sur la Cinq, mais on glisse assez vite." and its transduction into a derivation tree of type txt.]</p>
    </sec>
    <sec id="sec-2">
      <title>Grammar extraction</title>
      <p>Although the lexicon extracted from the derivation trees is representative enough of
an AB grammar and gives a probability distribution over the different types of each word, it
limits sentence analysis to the words of the treebank. This is why we
decided to extract a PCFG from the derivation trees.</p>
      <p>The output trees give both syntactic (we keep the initial labels) and structural
information about sentences. We added a preprocessing step so that
the user can control the extracted information. The only information we always
keep is the type of the nodes. The extracted grammar is a PCFG in which the probability of a rule is
computed relative to its root node. Recall that our grammar is defined by a tuple
{N, T, S, R}, where N is the set of non-terminal symbols (internal nodes of trees), T
the set of terminal symbols (typed leaves), S the initial symbol<sup>1</sup> and R the set of rules.</p>
      <p>The extraction algorithm traverses the trees and stores the derivation rules it sees. A
rule is composed of a root and one or two daughters. With two daughters, it is the usual case of a
right or left elimination rule (a → a/b b or a → b b\a). Otherwise, it is only a type
transmission, which appears when a noun phrase is composed of a proper name only,
or at the pre-terminal level when the POS-tag node transmits its type to the leaf.
The probabilities are computed within each group of rules sharing the same root.</p>
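      <p>A minimal sketch of this extraction step is given below. It assumes that the probability of a rule is its relative frequency among the rules with the same root, which is the usual estimate for a PCFG; the tree encoding and the names count_rules and extract_pcfg are hypothetical, and the two derivation trees are toy examples rather than treebank data.</p>
      <preformat><![CDATA[
```python
from collections import Counter

# A derivation tree is a pair (type, daughters); a typed leaf is a pair
# (type, word) whose second component is a string.


def count_rules(tree, counts):
    """Record one occurrence of root -> daughter types for each internal node."""
    root, rest = tree
    if isinstance(rest, str):      # typed leaf: no rule below it
        return
    counts[(root, tuple(child[0] for child in rest))] += 1
    for child in rest:
        count_rules(child, counts)


def extract_pcfg(trees):
    """Return {(root, daughters): probability}, normalized per root."""
    counts = Counter()
    for tree in trees:
        count_rules(tree, counts)
    totals = Counter()
    for (root, _), n in counts.items():
        totals[root] += n
    return {rule: n / totals[rule[0]] for rule, n in counts.items()}


toy_trees = [
    ("s", [("np", "on"), ("np\\s", "glisse")]),
    ("s", [("s", [("np", "on"), ("np\\s", "glisse")]), ("s\\s", "vite")]),
]
pcfg = extract_pcfg(toy_trees)
```
]]></preformat>
      <p>On these two toy trees, the root s is rewritten three times: twice as np followed by np\s and once as s followed by s\s, so the two rules receive the probabilities 2/3 and 1/3.</p>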
      <p>
        Table 1 summarizes the grammars that can be generated. Each one presents
useful information: the first one, from the derivation trees without any preprocessing step,
keeps the syntactic information given by the treebank. The others are more useful for
applying a sentence analysis algorithm, like CYK (see section 4), to non-typed
sentences. Table 2 shows some extracted rules.
      </p>
      <p>
The analysis process can be subdivided into two parts. On the one hand, we have to type
the words, while staying as close as possible to standard Lambek derivations. On the
other hand, the needed rules must belong to the input grammar.
      </p>
      <p>
By gathering the leaves of the derivation trees, we obtain a lexicon, as shown in Fig. 3.
However, a typing system based only on this lexicon restricts parsing to
sentences composed of words from the French Treebank. We decided to type words
with the Supertagger (see Moot [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ]), which enabled us to validate the Supertagger
results and to analyze sentences that do not occur in the Paris VII treebank.
      </p>
      <p>The Supertagger is trained with the lexicon extracted from the transduced
derivation trees.</p>
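      <p>Gathering the leaves of the derivation trees into a lexicon can be sketched as follows. The function name build_lexicon and the tree encoding are hypothetical, and the two trees are toy examples in which the types given to "ouvre" are chosen only for illustration.</p>
      <preformat><![CDATA[
```python
from collections import Counter, defaultdict

# A derivation tree is a pair (type, daughters); a typed leaf is a pair
# (type, word) whose second component is a string.


def build_lexicon(trees):
    """Map each word to the relative frequency of the types it received."""
    counts = defaultdict(Counter)

    def visit(node):
        label, rest = node
        if isinstance(rest, str):
            counts[rest][label] += 1
        else:
            for child in rest:
                visit(child)

    for tree in trees:
        visit(tree)
    return {
        word: {t: n / sum(types.values()) for t, n in types.items()}
        for word, types in counts.items()
    }


toy_trees = [
    ("s", [("np", "on"), ("np\\s", "ouvre")]),
    ("s", [("np", "on"),
           ("np\\s", [("(np\\s)/np", "ouvre"), ("np", "tout")])]),
]
lexicon = build_lexicon(toy_trees)
```
]]></preformat>
      <p>Such a lexicon gives a type distribution only for the words it has seen, which is the limitation that motivates the use of the Supertagger.</p>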
      <p>
        <sup>1</sup> S = TXT:txt or txt, depending on the preprocessing step.
      </p>
      <p>
We decided to use a probabilistic version of the CYK [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] algorithm, already tested and considered as a reference,
for the parsing of sentences. We removed the typing step
initially done by CYK with the rules n1 → t1 1, which is replaced by the work of the Supertagger. We
use the simplest grammar (3,494 rules) for the analysis. The first test, to check
that the program runs correctly, was to regenerate the trees from the transduced
sentences with the grammar extracted from the derivation trees. Then, we tested the
analysis with sentences typed by the Supertagger, using the grammar extracted from
the main treebank (see the results in section 5).
      </p>
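      <p>The following sketch shows one way to run a probabilistic CYK on pre-typed words. It is not the program used for the experiments: the function name cyk, the data layout and the toy grammar are assumptions made for illustration, only binary rules are handled (type transmission rules are ignored), and each word comes with a list of candidate types and confidences, as a supertagger could provide.</p>
      <preformat><![CDATA[
```python
def cyk(tagged, rules, goal):
    """Return the probability of the best derivation of `goal`, or 0.0.

    tagged: one list of (type, confidence) pairs per word (pre-typed input).
    rules:  {(root, (left, right)): probability} for binary rules.
    """
    n = len(tagged)
    if n == 0:
        return 0.0
    # Index the binary rules by their pair of daughters.
    by_daughters = {}
    for (root, daughters), p in rules.items():
        if len(daughters) == 2:
            by_daughters.setdefault(daughters, []).append((root, p))
    # The lexical step of CYK is replaced by the types given as input.
    chart = {}
    for i, candidates in enumerate(tagged):
        chart[i, i + 1] = dict(candidates)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                for left, p_left in chart[i, k].items():
                    for right, p_right in chart[k, j].items():
                        for root, p in by_daughters.get((left, right), []):
                            score = p * p_left * p_right
                            if score > cell.get(root, 0.0):
                                cell[root] = score   # keep the best score
            chart[i, j] = cell
    return chart[0, n].get(goal, 0.0)


# Toy input for "on glisse vite ." with hypothetical confidences.
tagged = [
    [("np", 1.0)],
    [("np\\s", 0.9), ("(np\\s)/pp", 0.1)],
    [("s\\s", 1.0)],
    [("s\\txt", 1.0)],
]
rules = {
    ("s", ("np", "np\\s")): 0.5,
    ("s", ("s", "s\\s")): 0.5,
    ("txt", ("s", "s\\txt")): 1.0,
}
best = cyk(tagged, rules, "txt")
```
]]></preformat>
      <p>A sentence is accepted when the returned probability for the initial symbol is non-zero; recovering the tree itself would additionally require storing back-pointers in each cell.</p>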
      <p>The derivation trees corresponding to the sentence "Celui-ci a importé à tout va
pour les besoins de la réunification." ("This one imported without restraint for the
needs of reunification.") are shown in Fig. 4. We took the two most probable trees,
typed by the Supertagger and analyzed with the grammar extracted from the main
treebank. Two types of information are relevant for choosing the best trees:
the probability and the complexity of the types. However, it is known that
comparing two trees that do not have the same shape or leaves is complex.
The main difference between the two trees is the prepositional phrase attachment; the
most probable tree is more representative of the original treebank.</p>
      <p>[Fig. 4: the two most probable derivation trees for this sentence. In one tree, "pour" has the type (s\s)/np and the prepositional phrase modifies the sentence; in the other, "pour" has the type pp/np and the prepositional phrase is an argument of "importé", typed (np\sp)/pp.]</p>
    </sec>
    <sec id="sec-3">
      <title>Results and evaluation</title>
    </sec>
    <sec id="sec-4">
      <title>G-Transducer</title>
      <p>Currently, the transducer handles at least 88% of the corpora (see Table 3 for details).
The lower percentage on the evaluation treebank is explained by the
remaining sentences, which are colloquial and complex. On Sequoia, the better results
can be explained by the greater simplicity of the sentences.</p>
      <p>
        The use of the rules on the main treebank is summarized in Table 4. We note that
even though many rules are used infrequently, they do not carry real weight in the overall
use of the rules. Some of the important rules are shown in Table 5 in the Tregex [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
parenthesized notation. We were also surprised by the few occurrences of the rule that treats
the determiner at the beginning of a noun phrase, despite the number of NPs in the
treebank; however, a separate rule treats only the case of a noun phrase composed of
a determiner and a noun, and it is called 8,892 times.
      </p>
      <p>The derivation trees of the three corpora allow us to extract three different
grammars. In addition, lexicons were created, containing words and their formulas. The
lexicon covers 96.6% of the words of the Paris VII treebank, i.e. 26,765 words out of
27,589, and for Sequoia it covers 95.1%, i.e. 18,350 words out of 19,284.</p>
    </sec>
    <sec>
      <title>Sentence parsing</title>
      <p>The parsing of typed sentences is done with the grammar extracted from the main
treebank, which has the widest coverage. Each treebank has been divided into two parts,
the transduced one and the non-transduced one. For the Supertagger, we used a β
equal to 0.01: even if the supertagging and the parsing steps are slower, the results
are much better than with a β of 0.05. The results are gathered in Table 6. We note that
non-transduced sentences are nevertheless analyzed, even if the results are less accurate
than for the other sentences.</p>
      <p>
        We also tested the precision of the Supertagger (for the complete tests, see [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]): the
Supertagger can adjust the number of types given to a word with the β parameter. It
selects the formulas for which the confidence of the Supertagger is at least β times
the confidence of the first supertag. Table 7 summarizes, for different values of β, the time spent analyzing the
fragment of 440 sentences of the evaluation corpus and the effectiveness of the algorithm.
The high number of types is due to the limitations of
AB grammars. When the Supertagger is used for multimodal categorial grammars, the
average number of formulas is around 2.4 with β = 0.01 and 4.5 with β = 0.001;
the correctness is better too, with rates of 98.2% and 98.8% respectively.
      </p>
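      <p>The effect of the β parameter can be illustrated with a short sketch. The function select_supertags and the confidence values below are hypothetical and do not reflect the Supertagger's actual interface or output; the sketch only implements the selection criterion stated above, namely keeping the formulas whose confidence is at least β times that of the best one.</p>
      <preformat><![CDATA[
```python
def select_supertags(scored, beta):
    """Keep the (formula, confidence) pairs whose confidence is at least
    beta times the confidence of the best formula, best first."""
    if not scored:
        return []
    best = max(confidence for _, confidence in scored)
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [(formula, confidence) for formula, confidence in ranked
            if confidence >= beta * best]


# Hypothetical confidences for one word.
scored = [
    ("np\\s", 0.80),
    ("(np\\s)/pp", 0.15),
    ("(np\\s)/np", 0.03),
    ("s\\s", 0.005),
]
kept_005 = select_supertags(scored, 0.05)
kept_001 = select_supertags(scored, 0.01)
kept_0001 = select_supertags(scored, 0.001)
```
]]></preformat>
      <p>With these toy values, lowering β from 0.05 to 0.01 and then to 0.001 keeps two, three and four formulas respectively, which illustrates why a smaller β gives the parser more candidates at the cost of a slower analysis.</p>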
    </sec>
    <sec id="sec-5">
      <title>Conclusion and prospects</title>
      <p>In this article, we briefly introduced the principle of the G-transducer, which we used to
transform syntactic trees into derivation trees of an AB grammar. Then we explained
how we extracted a PCFG and used it for sentence analysis. The experimental
results of the CYK algorithm used with typed sentences enabled us to compare the
annotation from the transducer and the Supertagger.</p>
      <p>
        However, the work is still ongoing and opens many perspectives. Of course, we want to
extend the coverage of the transducer beyond 95% and simplify the types of words.
The main problem is that only complex cases remain, but we should be able to find
derivation trees for them, given that we can analyze some of them with the CYK algorithm.
In order to improve parsing precision, we intend to integrate modern techniques such
as those of [
        <xref ref-type="bibr" rid="ref20 ref3">3, 20</xref>
        ] into our parser. Using the Charniak method [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], we would like to
transform our grammar into a highly lexicalized grammar.
      </p>
      <p>Given that AB grammars may seem limiting in the case of a complex language, we
wish to transform our transducer into a tree-to-graph transducer. This way, we would
be able to use the whole Lambek calculus.</p>
      <p>The XML version of the treebank gives more information about the words, such as the
tense of verbs. A major evolution would be to reflect this information in our
transducer, even if it implies many changes to it.</p>
      <p>
        Our work and programs are available at [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], under the GNU General Public License.
      </p>
    </sec>
    <sec id="sec-6">
      <title>References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abeillé</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clément</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toussenel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Building a treebank for French</article-title>
          . Treebanks, Kluwer, Dordrecht (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Abeillé</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clément</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toussenel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Annotation Morpho-syntaxique</article-title>
          . http://llf.linguist.jussieu.fr (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Auli</surname>
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lopez</surname>
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Efficient CCG Parsing: A* versus Adaptive Supertagging</article-title>
          .
          <source>In Proceedings of the Association for Computational Linguistics</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Buszkowski</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Penn</surname>
          </string-name>
          , G.:
          <article-title>Categorial grammars determined from linguistic data by unification</article-title>
          .
          <source>Studia Logica</source>
          (
          <year>1990</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Candito</surname>
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Seddah</surname>
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Le corpus Sequoia : annotation syntaxique et exploitation pour l'adaptation d'analyseur par pont lexical</article-title>
          .
          <source>TALN'2012 proceedings</source>
          , Grenoble, France (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Charniak</surname>
            <given-names>E.</given-names>
          </string-name>
          <article-title>A maximum-entropy-inspired parser</article-title>
          .
          <source>In Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL (NAACL)</source>
          , Seattle (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Comon</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dauchet</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jacquemard</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lugiez</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tison</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tommasi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Tree automata techniques and applications</article-title>
          (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hockenmaier</surname>
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Data and Models for Statistical Parsing with Combinatory Categorial Grammar</article-title>
          .
          <source>PhD thesis</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hopcroft</surname>
            <given-names>J.E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ullman J.D.</surname>
          </string-name>
          :
          <article-title>Introduction to Automata Theory, Languages, and Computation</article-title>
          . Addison-Wesley Publishing Company (
          <year>1979</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Klein</surname>
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Manning</surname>
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Accurate Unlexicalized Parsing</article-title>
          .
          <source>In Proceedings of the Association for Computational Linguistics</source>
          (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Knuth</surname>
            <given-names>D.E.</given-names>
          </string-name>
          :
          <source>The Art of Computer Programming</source>
          , Volume
          <volume>2</volume>
          : Seminumerical Algorithms (3rd ed.). Addison-Wesley Professional
          (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Kanazawa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Learnable Classes of Categorial Grammars</article-title>
          . Center for the Study of Language and Information (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lambek</surname>
          </string-name>
          , J.:
          <source>The Mathematics of Sentence Structure</source>
          . (
          <year>1958</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Levy</surname>
            <given-names>R.</given-names>
          </string-name>
          , Andrew G.:
          <article-title>Tregex and Tsurgeon: tools for querying and manipulating tree data structures</article-title>
          . http://nlp.stanford.edu/software/tregex.shtml (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Moot</surname>
          </string-name>
          , R.:
          <article-title>Automated extraction of type-logical supertags from the Spoken Dutch Corpus</article-title>
          .
          <source>Complexity of Lexical Descriptions and its Relevance to Natural Language Processing: A Supertagging Approach</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Moot</surname>
          </string-name>
          , R.:
          <article-title>Semi-automated Extraction of a Wide-Coverage Type-Logical Grammar For French</article-title>
          .
          <source>Proceedings TALN</source>
          <year>2010</year>
          , Montreal (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Sandillon-Rezer</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          :
          <article-title>Learning categorial grammar with tree transducers</article-title>
          .
          <source>ESSLLI Student Session Proceedings</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Sandillon-Rezer</surname>
            ,
            <given-names>N-F.</given-names>
          </string-name>
          :
          <article-title>Extraction de PCFG et analyse de phrases pré-typées</article-title>
          .
          <source>Recital</source>
          <year>2012</year>
          , Grenoble (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Younger</surname>
            <given-names>D.H.</given-names>
          </string-name>
          :
          <article-title>Context Free Grammar processing in n<sup>3</sup></article-title>
          . (
          <year>1968</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Zhang</surname>
            <given-names>Y.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Clark</surname>
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Shift-Reduce CCG Parsing</article-title>
          .
          <source>In Proceedings of the Association for Computational Linguistics</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Sandillon-Rezer</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          : http://www.labri.fr/perso/nfsr/ (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>