<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Formal Concept Analysis to Acquire Knowledge about Verbs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ingrid Falk</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claire Gardent</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alejandra Lorenzo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNRS/LORIA</institution>
          ,
          <addr-line>Nancy</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>INRIA/Lorraine University</institution>
          ,
          <addr-line>Nancy</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <fpage>151</fpage>
      <lpage>162</lpage>
      <abstract>
        <p>We use Formal Concept Analysis (FCA) to acquire information about verbs as required by Natural Language Processing (NLP) applications. In particular, we show that stable concepts permit creating verb classes with good generalisation power; and that association rules are useful for complementing incomplete verb information.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Natural language processing (NLP) applications aim either to interpret
(analysis) or to produce text data (generation). Because verbs are a central component
of natural language sentences, detailed knowledge about their syntactic and
semantic behaviour is an essential ingredient of many such applications. In
particular, detailed subcategorisation information (that is, information about the
number and the syntactic type of a verb's complements) has repeatedly been
shown to be crucial in enhancing their linguistic coverage and their theoretical
accuracy
        <xref ref-type="bibr" rid="ref2 ref3">([Briscoe and Carroll, 1993], [Carroll and Fang, 2004])</xref>
        .
      </p>
      <p>
        To acquire and structure such knowledge, verb classifications have been
proposed which group together verbs with similar syntactic and/or semantic
behaviour. On the practical side, verb classes permit capturing generalisations
about verb behaviour, thus reducing both the effort needed to construct a verb
lexicon and the likelihood that errors are introduced when adding new entries.
On the theoretical side, [
        <xref ref-type="bibr" rid="ref10">Levin, 1993</xref>
        ] has shown that syntax reflects semantics
and, consequently, that verbs that belong to a syntactic class can often be shown to
share a semantic component.
      </p>
      <p>
        For English, there exist several large-scale resources providing verb classes
        <xref ref-type="bibr" rid="ref1 ref12 ref5">(FrameNet [Baker et al., 1998], VerbNet [Schuler, 2006] and, to a lesser extent,
WordNet [Fellbaum, 1998])</xref>
        in a format that is amenable to use by natural
language processing systems. For French, however, existing verb classes are either
too restricted in scope
        <xref ref-type="bibr" rid="ref11">(Volem [Saint-Dizier, 1999])</xref>
        or not sufficiently structured
        <xref ref-type="bibr" rid="ref6">(the LADL tables [Gross, 1975])</xref>
        to be directly useful for NLP.
      </p>
      <p>
        In this paper, we explore the use of Formal Concept Analysis (FCA) to
acquire classes for French verbs from the available lexical resources<sup>3</sup>. Additionally,
we show that association rules can be put to work to extend and complement
an existing subcategorisation lexicon. The paper is structured as follows.
Section 2 shows how Dicovalence, a subcategorisation lexicon for French verbs, can
be used to construct a lattice whose concepts are potential verb classes with
objects being verbs and attributes being subcategorisation frames. We use
concept stability as introduced by [Kuznetsov, 2007] for filtering and show that the
resulting set of classes (i) achieves reasonably high coverage (77% of the verbs
contained in the Dicovalence lexicon) and (ii) gives rise to verb classes with good
factorisation power in that most classes associate several frames with the verbs
they contain. In Section 3, we extend the approach to construct verb classes that
integrate both syntactic and semantic information. Finally, Section 4 shows how
applying high-confidence association rules derived from the Dicovalence formal
context to a different lexicon permits extending the coverage of Dicovalence.
      </p>
      <p><sup>3</sup> For other FCA applications for classification in NLP see e.g. [
        <xref ref-type="bibr" rid="ref4">Cimiano et al., 2005</xref>
        ].</p>
      <p>2 Using formal concept analysis to acquire valency-based verb classes</p>
      <p>
Formal concept analysis is one of many applicable classification and clustering
techniques. We exploit it here to create concepts where the objects are verbs and
the attributes are syntactic frames. Starting from a valency lexicon for French
which associates each verb with a set of valency frames, we build a concept lattice
and extract from it the most stable concepts. We start by presenting the two
lexicons used to build the lattice and to evaluate the acquired verb classes, namely
Dicovalence and VerbNet. We then describe the verb classification obtained and
compare it to VerbNet.</p>
      <p>2.1 Dicovalence, a valency lexicon for French verbs</p>
      <p>
The Dicovalence lexicon [
        <xref ref-type="bibr" rid="ref14">van den Eynde and Mertens, 2003</xref>
        ] lists the valency
frames of 3 936 French verbs. A valency frame characterises the number and
the type of the syntactic arguments expected by a verb. For instance, the
valency frames for maintenir can be described as illustrated below. Each frame
describes a set of syntactic arguments and each argument is characterised by a
grammatical function<sup>4</sup> and a syntactic category (NP indicates a noun phrase,
PP a prepositional phrase, CL a clitic, i.e., a weak pronoun). The use of each
frame is illustrated by an example.
      </p>
      <p>SUJ:NP, (OBJ:NP): Manifester qu'il a les moyens de maintenir un cap.</p>
      <p>SUJ:NP, OBJ:NP, ATO:XP: Le PDG d'Hachette s'est engagé à maintenir ouvert le petit robinet d'alimentation qui
permettra à la Cinq de conserver une trésorerie minimale.</p>
      <p>SUJ:NP, A-OBJ:PP, refl:CL: La poursuite de la baisse de l'investissement productif se maintient à 2,5 % en rythme
annuel depuis la mi-Novembre.</p>
      <p>SUJ:NP, (OBJ:NP), P-OBJ:PP: L'écart entre taux des prêts et taux de refinancement leur permet de maintenir des concours
suffisants aux entreprises demeurées solvables, puis d'accroître ce volume à mesure que les
mauvais risques sont provisionnés.</p>
      <p>SUJ:NP, refl:CL: Le beau temps se maintient.</p>
      <p><sup>4</sup> SUJ refers to the subject grammatical function and OBJ to the object; P-OBJ, A-OBJ
and DE-OBJ describe prepositional objects introduced by any preposition, à or de
respectively; ATO indicates an object attribute.</p>
      <p>2.2 VerbNet, a classification of English verbs</p>
      <p>
VerbNet
        <xref ref-type="bibr" rid="ref12">([Schuler, 2006])</xref>
        is the largest electronic verb classification for English.
It was created manually and classifies 3 626 verbs using 411 classes. Each VerbNet
class includes, among other things, a set of verbs and a set of valency frames. For
instance, the Hit-18.1 class associates verbs and frames as follows<sup>5</sup>:
      </p>
      <p>Verbs: batter, beat, bump, butt, drum, hammer, hit, jab, kick, knock,
lash, pound, rap, slap, smack, smash, strike, tap</p>
      <p>Frames: SUJ:NP,P-OBJ:PP; SUJ:NP,P-OBJ:PP,P-OBJ:PP; SUJ:NP,OBJ:NP;
SUJ:NP,OBJ:NP,P-OBJ:PP; SUJ:NP,DE-OBJ:Ssub</p>
      <p>2.3 Verb classes as stable concepts</p>
      <p>
To construct verb classes that group together verbs sharing a set of frames, we
first build a concept lattice<sup>6</sup>. The formal context K used to build this lattice is
the triple ⟨V, F, R⟩ such that V is the set of verbs contained in Dicovalence,
F the set of valency frames used in Dicovalence and R the mapping defined
by Dicovalence between verbs and frames: (v, f) ∈ R iff Dicovalence associates
the verb v with the frame f. The concept lattice of this context K contains
2115 concepts, i.e., potential verb classes. Clearly, however, not all these concepts
are interesting verb classes. Classes aim to factorise information and express
generalisations about verbs. Hence, concepts with few (1 or 2) verbs can hardly
be viewed as classes. Similarly, concepts with few frames are less interesting,
especially if many of the verb subclasses of the extension of these concepts have
more frames than there are in their intension.</p>
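      <p>As a minimal sketch of how the concepts of such a context arise, the following Python fragment enumerates the formal concepts of a small verb/frame context by closing every subset of verbs. The context, the verb and frame names and the function names are illustrative assumptions and are not taken from Dicovalence; the brute-force enumeration is only practical for toy contexts and is not how the Galicia software builds lattices.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical toy context: verb -> set of valency frames (not Dicovalence data).
CONTEXT = {
    "v1": {"f1", "f2"},
    "v2": {"f1", "f2", "f3"},
    "v3": {"f1"},
    "v4": {"f3"},
}
ALL_FRAMES = set().union(*CONTEXT.values())


def common_frames(verbs):
    """Derivation A': frames shared by every verb in `verbs` (all frames if empty)."""
    frames = set(ALL_FRAMES)
    for verb in verbs:
        frames &= CONTEXT[verb]
    return frames


def verbs_having(frames):
    """Derivation B': verbs whose frame set contains every frame in `frames`."""
    return {verb for verb, fs in CONTEXT.items() if frames <= fs}


def formal_concepts():
    """Return all (extent, intent) pairs, obtained by closing each subset of verbs."""
    found = set()
    verbs = sorted(CONTEXT)
    for size in range(len(verbs) + 1):
        for subset in combinations(verbs, size):
            intent = common_frames(subset)
            extent = verbs_having(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


concepts = formal_concepts()
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```
]]></preformat>
      <p>Each printed pair is a candidate verb class: the extent is the verb set and the intent is the frame set shared by exactly those verbs.</p>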
      <p>
        Therefore we need a filtering method to select, from the large set of concepts
contained in the lattice, those which are most likely to adequately characterise
verb sets. One of the relatively few works addressing this issue of keeping
interesting patterns while removing useless information from lattices based on
potentially noisy data is presented in [
        <xref ref-type="bibr" rid="ref8">Klimushkin et al., 2010</xref>
        ]. These
experiments show that concept stability performs well compared to the other reviewed
measures (concept probability and separability). We therefore use this measure
here and take into account only those concepts that are intensionally stable
        <xref ref-type="bibr" rid="ref9">([Kuznetsov, 2007])</xref>
        .
      </p>
      <p><sup>5</sup> The VerbNet format for valency frames does not mention grammatical functions. We
have added them here to preserve notation consistency and facilitate reading.</p>
      <p><sup>6</sup> We used the Galicia Lattice Builder software (http://www.iro.umontreal.ca/~galicia/) to build the lattices.</p>
      <p>The intensional stability of a concept (V, F) is defined as follows, where A' denotes the set of frames shared by all verbs in A:
        <disp-formula><tex-math>\sigma_i((V, F)) = \frac{|\{A \subseteq V \mid A' = F\}|}{2^{|V|}}</tex-math></disp-formula>
      </p>
      <p>Intuitively, a more stable concept is less dependent on individual members
of its extension and is therefore more resistant to outliers or other noisy data
items.</p>
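      <p>The definition can be checked by brute force on a small context. The sketch below counts, for a concept with a given extent and intent, the subsets of the extent whose shared frames are exactly the intent, and divides by two to the power of the extent size. The toy context and all names are assumptions made for illustration; the exhaustive enumeration is exponential in the extent size and is not the bottom-up traversal used in the experiments reported here.</p>
      <preformat><![CDATA[
```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy context: verb -> set of frames (illustrative only).
CONTEXT = {
    "v1": {"f1", "f2"},
    "v2": {"f1", "f2", "f3"},
    "v3": {"f1"},
    "v4": {"f3"},
}
ALL_FRAMES = frozenset().union(*CONTEXT.values())


def common_frames(verbs):
    """A': the frames shared by all verbs in `verbs` (every frame if `verbs` is empty)."""
    frames = set(ALL_FRAMES)
    for verb in verbs:
        frames &= CONTEXT[verb]
    return frozenset(frames)


def intensional_stability(extent, intent):
    """Fraction of the subsets A of `extent` for which A' equals `intent`."""
    members = sorted(extent)
    target = frozenset(intent)
    hits = 0
    for size in range(len(members) + 1):
        for subset in combinations(members, size):
            if common_frames(subset) == target:
                hits += 1
    return Fraction(hits, 2 ** len(members))


print(intensional_stability({"v1", "v2"}, {"f1", "f2"}))
print(intensional_stability({"v2"}, {"f1", "f2", "f3"}))
```
]]></preformat>
      <p>Exact fractions are used so that a score can be compared with a threshold without rounding effects.</p>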
      <p>For instance, given the concepts C1 to C8 below, setting the stability
threshold to above 0.5 will filter out all concepts except C2, C6 and C7. If we further
eliminate concepts whose extension is a singleton (classes with one verb only),
then the only extracted verb class will be C2 = ⟨{v1, v2}, {f1, f2, f3}⟩. That is,
by retaining as verb classes only those concepts whose intensional stability is
high, we produce classes which strike a good balance between the size of the
frame set and that of the verb set.</p>
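      <sec id="sec-1-1">
        <table-wrap>
          <table>
            <thead>
              <tr><th>Concept</th><th>Extension</th><th>Intension</th><th>Stability</th><th>Decision</th></tr>
            </thead>
            <tbody>
              <tr><td>C1</td><td>v1, v2, v3</td><td>f1</td><td>3/8 = 0.37</td><td></td></tr>
              <tr><td>C2</td><td>v1, v2</td><td>f1, f2, f3</td><td>4/4 = 1</td><td>✓</td></tr>
              <tr><td>C3</td><td>v1, v3</td><td>f1</td><td>2/4 = 0.5</td><td></td></tr>
              <tr><td>C4</td><td>v2, v3</td><td>f1</td><td>2/4 = 0.5</td><td></td></tr>
              <tr><td>C5</td><td>v1</td><td>f1, f2, f3</td><td>1/2 = 0.5</td><td></td></tr>
              <tr><td>C6</td><td>v2</td><td>f1, f2, f3</td><td>2/2 = 1</td><td></td></tr>
              <tr><td>C7</td><td>v3</td><td>f1</td><td>2/2 = 1</td><td></td></tr>
              <tr><td>C8</td><td>∅</td><td>f1, f2, f3</td><td>1/1 = 1</td><td></td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-1-2">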
        <p>
          As illustrated by this example, keeping only the more stable concepts
potentially implies that some verbs may be excluded from the classification (here v3).
Figure 1 shows how the chosen stability threshold affects verb coverage, that is,
the proportion of Dicovalence verbs covered by the resulting classes. Varying
the stability threshold (from 90 to 76) has little impact on coverage (from 3025
verbs to 3043 verbs, i.e., 18 verbs, with the stability threshold decreasing from
90 to 76) but a strong impact on the number of classes (from 212 to 506)<sup>7</sup>.
Overall, keeping only concepts with stability in the upper 10% permits covering
approximately 77% of the verbs in Dicovalence. To further assess the impact of
the chosen stability threshold on the verb classes obtained, we compare these
classifications with respect to their number of singleton verb / frame classes, to
the average number of frames / verbs per class and to the average harmonic mean of
verb and frame size per class. Table 1 shows how these numbers vary with the
chosen stability threshold and compares them with those for VerbNet. The graphs
in Figure 2 compare the distribution of the verbs in classes with respect to the number of
associated frames for these classifications and for VerbNet. Focusing first on the
graphs (Figure 2), we observe that the stability threshold has little impact on
the number of verbs being in classes with 1 or 2 frames. With a stability
threshold of 76%, approximately 56% of the verbs are in such classes against 57% with
a threshold of 85 or 86% (not shown in the figure). More generally, a stability
threshold around 85% seems to offer a good compromise between the size of the
frame sets (from 1 to 10 with 43% of the verbs having more than 2 frames), the
overall verb coverage and the number of classes (315 for a threshold of 85% and
285 for a threshold of 86%).
        </p>
        <p><sup>7</sup> We computed concept stability following [
          <xref ref-type="bibr" rid="ref7">Jay et al., 2008</xref>
          ]. Calculating stability is
known to be #P-complete; however, [
          <xref ref-type="bibr" rid="ref7">Jay et al., 2008</xref>
          ] show that when the concept
lattice is known it can be computed efficiently by a bottom-up traversal algorithm.
Our experiments confirm these results.</p>
        <p>[Figure 1: DV verb coverage by descending stability thresholds. The number of DV verbs covered is plotted
against descending stability threshold. The numbers above the points are the
number of concepts in a set.]</p>
        <sec id="sec-1-2-1">
          <title>Distribution of verbs against size of classes in terms of frames</title>
          <p>[Figure 2: distribution of verbs against size of classes in terms of frames, for
the classifications obtained with FCA with varying stability thresholds (panels (a) and
(b)) and for VerbNet. X-axis: number of frames; y-axis: number of verbs in classes with the number of frames given by the x-axis.]</p>
          <p>Table 1 gives more details about the comparative properties of the
various classifications. Two points give further support for a threshold around 85%.
First, a lower threshold increases the number of classes, while a stability threshold
around 85% permits keeping this number down, thereby improving the
generalisation and factorisation power of the classification. Second, the harmonic mean
of the verb set size and the frame set size increases with the stability threshold.
In other words, the classes obtained with a higher threshold are overall better
balanced and more populated.</p>
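          <p>To see why the harmonic mean rewards balanced classes, consider the following minimal sketch. The class sizes are made up for illustration and the function name is an assumption; only the formula 2vf / (v + f) for v verbs and f frames is the standard harmonic mean of two numbers.</p>
          <preformat><![CDATA[
```python
def harmonic_mean(n_verbs, n_frames):
    """Harmonic mean of two set sizes, 2*v*f / (v + f); 0.0 when both are empty."""
    total = n_verbs + n_frames
    return 2 * n_verbs * n_frames / total if total else 0.0


# Hypothetical classes given as (number of verbs, number of frames).
classes = [(100, 1), (10, 10), (4, 6)]

for n_verbs, n_frames in classes:
    print(n_verbs, n_frames, round(harmonic_mean(n_verbs, n_frames), 2))

# Average harmonic mean over the classes.
average = sum(harmonic_mean(v, f) for v, f in classes) / len(classes)
```
]]></preformat>
          <p>A class with many verbs but a single frame scores just under 2, whereas a class with ten verbs and ten frames scores 10, so the measure is dominated by the smaller of the two sizes.</p>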
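          <table-wrap>
            <label>Table 1</label>
            <table>
              <thead>
                <tr><th>Stability threshold</th><th>75%</th><th>84%</th><th>85%</th><th>86%</th><th>VerbNet</th></tr>
              </thead>
              <tbody>
                <tr><td>Nb. of classes</td><td>506</td><td>338</td><td>315</td><td>285</td><td>411</td></tr>
                <tr><td>Min. verbs</td><td>2</td><td>4</td><td>4</td><td>7</td><td>1</td></tr>
                <tr><td>Max. verbs</td><td>1555</td><td>1555</td><td>1555</td><td>1555</td><td>383</td></tr>
                <tr><td>Min. frames</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>
                <tr><td>Max. frames</td><td>16</td><td>10</td><td>10</td><td>7</td><td>25</td></tr>
                <tr><td>Classes with 1 verb</td><td>0</td><td>0</td><td>0</td><td>0</td><td>29</td></tr>
                <tr><td>Classes with 1 frame</td><td>20</td><td>17</td><td>17</td><td>16</td><td>44</td></tr>
                <tr><td>Average class size (verbs)</td><td>53.06</td><td>70.78</td><td>75.03</td><td>78.48</td><td>14.96</td></tr>
                <tr><td>Average class size (frames)</td><td>3.80</td><td>3.55</td><td>3.51</td><td>3.48</td><td>4.02</td></tr>
                <tr><td>Average class size (harmonic mean)</td><td>5.90</td><td>5.98</td><td>5.98</td><td>6.01</td><td>4.67</td></tr>
                <tr><td>Total number of verbs</td><td colspan="4">3936</td><td>3626</td></tr>
                <tr><td>Total number of frames</td><td colspan="4">136</td><td>117</td></tr>
              </tbody>
            </table>
          </table-wrap>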
          <p>3 Acquiring syntactico-semantic verb classes</p>
          <p>
            Beth Levin's hypothesis (cf. Section 1) states that syntax correlates with
semantics. To create verb classes which capture both a shared syntactic behaviour (a
shared set of valency frames) and a shared meaning component, we draw on
another verb resource for French, namely the LADL tables
            <xref ref-type="bibr" rid="ref6">([Gross, 1975])</xref>
            . These
tables were specified manually over several years by a large team of expert
linguists and contain syntactic and semantic information about French verbs. For
instance, a table might state that the subject of all verbs in that table must be
human, or that the object is a destination, etc. The classes created by the LADL
tables, however, are both too fine- and too coarse-grained to be useful for NLP.
They are too coarse-grained in that, at the table level, a single subcategorisation
frame and a semantic description are associated with a large set of verbs:
information about the syntactic subclasses corresponding to different valency frame
sets is not provided. They are too fine-grained in that, within a table, detailed
information is given about each individual verb but not about sub-groups of
verbs.
          </p>
          <p>To create verb classes that are characterised both by a set of valency frames
and by semantic information, we apply the same method as described in Section 2,
using as attributes both the valency frames contained in Dicovalence and the
LADL table identifiers. That is, the formal context used to build the lattice
and extract stable concepts is the context ⟨V, F, R⟩ where V is the set of verbs
contained in the intersection of Dicovalence and the LADL tables, F is the
union of the set of valency frames used in Dicovalence with the set of LADL
table identifiers and R the mapping such that (v, f) ∈ R if either Dicovalence
or the LADL tables associate the verb v with the frame/table f. The resulting
context has 3536 verbs and 172 attributes (frames and table identifiers) and
the obtained lattice has 31494 concepts. As before, we rank the concepts by
stability. Additionally, we filter out concepts whose intension does not contain
at least one table identifier and 2 valency frames. In this way, we ensure that
each concept extracted from the FCA lattice assigns the verb group denoted by
the concept extension both a semantic (LADL table description) and a syntactic
characterisation (valency frames). We require that the concept intension contains
at least 2 valency frames since each LADL table is associated with a defining
valency frame.</p>
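          <p>The additional constraint on intensions can be sketched as follows. The fragment assumes, purely for illustration, that table identifiers can be told apart from valency frames by membership in a known set of identifiers; the concept representation as (extent, intent, stability) triples, the stability scores and all names are hypothetical.</p>
          <preformat><![CDATA[
```python
# Hypothetical LADL table identifiers; any other attribute counts as a valency frame.
TABLE_IDS = {"32RA", "8"}


def keep_concept(intent, min_tables=1, min_frames=2):
    """True if `intent` holds at least one table identifier and two valency frames."""
    tables = [attr for attr in intent if attr in TABLE_IDS]
    frames = [attr for attr in intent if attr not in TABLE_IDS]
    return len(tables) >= min_tables and len(frames) >= min_frames


def select_classes(concepts, top_n):
    """Filter (extent, intent, stability) triples, then keep the `top_n` most stable."""
    kept = [concept for concept in concepts if keep_concept(concept[1])]
    kept.sort(key=lambda concept: concept[2], reverse=True)
    return kept[:top_n]


# Illustrative concepts with made-up stability scores.
concepts = [
    ({"rougir", "verdir"}, {"32RA", "SUJ:NP", "SUJ:NP,OBJ:NP"}, 0.9),
    ({"rougir"}, {"32RA", "8", "SUJ:NP"}, 0.7),          # a single frame: dropped
    ({"rougir", "verdir", "dormir"}, {"SUJ:NP"}, 0.95),  # no table identifier: dropped
]
selected = select_classes(concepts, top_n=500)
print(selected)
```
]]></preformat>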
          <p>Here is an example class extracted by this method. The class groups together
verbs which indicate a change of state (mainly colour and age) and which can
be used with and without object (Jean rougit / Jean turned red ; Jean rougit le
mur / Jean painted the wall red) and with a sentential de-object (Jean rougit de
ce que Marie l'injure / Jean blushed that Marie insults him).</p>
          <p>Verbs: blanchir (to whiten), bleuir (to turn blue), blêmir (to turn pale),
pâlir (to turn white), rajeunir (to become younger),
rosir (to turn pink), rougir (to blush), verdir (to turn green),
vieillir (to become old)</p>
          <p>LADL tables: 32RA (Make Adjv), 8 (Verbs with sentential complement in de)</p>
          <p>Frames: SUJ:NP; SUJ:NP,OBJ:NP; SUJ:NP,DE-OBJ:Ssub</p>
          <p>Taking the top 500 concepts obeying the set constraints yields a set of classes
such that each class is associated with one or more semantic labels (i.e., LADL
tables) and between 2 and 6 valency frames. Furthermore, each resulting class
contains between 9 and 237 verbs, with an overall verb coverage of 62%. That is,
the 500 classes cover 62% of the verbs present in the intersection of Dicovalence
and the LADL tables. Overall, the classes obtained are thus interesting in that
they are associated with an informative syntactico-semantic characterisation;
they group together a satisfactory number of verbs; and they permit covering a
majority of the verbs covered by the verb resources used. Although coverage could
be better, it is worth stressing that manual resources are always incomplete and
imperfect. It is therefore likely that this incomplete coverage is due to missing
and/or erroneous information either in the LADL lexicon (missing verbs in a
table might prevent a syntactic class from being associated with that class, thereby
decreasing verb coverage) or in Dicovalence (missing frames might block a verb
from being integrated in a class). Figure 3 shows for each LADL table the number
of classes it includes. Interestingly, for most tables (61%), fewer than 5 classes are
identified; this suggests a relatively strong association between the syntactic
frames associated with these classes and the semantic component labelling the
table. There are 5 tables which are assigned no class; these are all relatively
small tables (around 20 verbs) for which no syntactic class could be found whose
verbs were included in the set of verbs contained by the table.</p>
          <p>[Figure 3: Distribution of tables in classes. X-axis: LADL table name; y-axis: number of classes with the table on the x-axis.]</p>
          <p>4 Using association rules to extend the lexicon</p>
          <p>
Formal concept analysis provides another useful tool for developing verb
resources, namely association rules. We first introduce them. We then show how
association rules can be used to complement Dicovalence with frame information
derived from another lexical resource.</p>
          <p>4.1 Association rules, confidence and lift</p>
          <p>
Given a context K = ⟨V, F, R⟩ with attributes F, an association rule A → B
with A, B ⊆ F relates itemsets of this context, i.e., sets of attributes. Thus, in
our case, association rules describe dependencies between sets of frames.</p>
          <p>
            Association rules can be evaluated using various metrics such as confidence
and lift
            <xref ref-type="bibr" rid="ref13">([Szathmary, 2006])</xref>
            . The confidence of a rule A → B captures the
probability of B given A. It is defined as the ratio between the number of objects
having attributes A and B, and the number of objects having attributes A.
Intuitively, it is the proportion of objects with A that also have B. The confidence of an association
rule A → B is defined as:
            <disp-formula><tex-math>\mathrm{conf}(A \rightarrow B) = \frac{P(A \cup B)}{P(A)} = \frac{\mathrm{sup}(A \cup B)}{\mathrm{sup}(A)}</tex-math></disp-formula>
where sup(F<sub>1</sub>), the support of an itemset F<sub>1</sub> ⊆ F, is the number of
objects including F<sub>1</sub>.</p>
          <p>The lift value of an association rule measures the strength of the association
between the antecedent and the consequent. It is defined as the ratio of the
confidence of the rule and the relative support of the consequent:
            <disp-formula><tex-math>\mathrm{lift}(A \rightarrow B) = \frac{P(A \cup B)}{P(A)\,P(B)} = \frac{\mathrm{conf}(A \rightarrow B)}{\mathrm{rsup}(B)} = \frac{\mathrm{rsup}(A \cup B)}{\mathrm{rsup}(A)\,\mathrm{rsup}(B)}</tex-math></disp-formula>
where the relative support rsup(F<sub>1</sub>) for F<sub>1</sub> ⊆ F is sup(F<sub>1</sub>) / |V|. The lift is
a value between 0 and infinity. A lift value greater than 1 indicates that the
antecedent and the consequent appear together more often than expected.</p>
          <p>4.2 Using association rules to extend Dicovalence</p>
          <p>
Dicovalence only covers the most frequent verbs of French. Using another verb
lexicon (namely the LADL tables), we exploit association rules derived from the
Dicovalence data to predict frames for verbs not in Dicovalence but that are
partially described in the LADL tables. In this way, we complement Dicovalence
with both the LADL table frame information (each table and thus each verb in
that table is associated with a valency frame) and the information contained in
the inferred frames.</p>
          <p>Based on the context ⟨V, F, R⟩ introduced in Section 2, we compute<sup>8</sup> the
minimal non-redundant association rules, that is, the set of association rules
F<sub>1</sub> → F<sub>2</sub> such that F<sub>2</sub> is a closed itemset and F<sub>1</sub> is the minimal generator of F<sub>2</sub>.
We then rank the rules according to both lift and confidence. Figure 4 shows the
distribution of these rules. Most rules have a confidence between 98 and 100%.
Moreover, almost all rules have a lift above 1, indicating that the association
between the frame sets related by the rules is higher than chance.</p>
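          <p>Confidence and lift as defined in Section 4.1 can be computed directly from a context. The following sketch does so on a hypothetical four-verb context and ranks two hand-written rules by lift and then confidence. The context, the rules and the function names are assumptions for illustration; the fragment does not derive minimal non-redundant rules, for which a dedicated system is needed.</p>
          <preformat><![CDATA[
```python
from fractions import Fraction

# Hypothetical toy context: verb -> set of frames (illustrative only).
CONTEXT = {
    "v1": {"f1", "f2"},
    "v2": {"f1", "f2", "f3"},
    "v3": {"f1"},
    "v4": {"f3"},
}


def sup(itemset):
    """Support: number of verbs whose frame set includes every frame in `itemset`."""
    return sum(1 for frames in CONTEXT.values() if set(itemset) <= frames)


def rsup(itemset):
    """Relative support: sup(itemset) / |V|."""
    return Fraction(sup(itemset), len(CONTEXT))


def conf(antecedent, consequent):
    """conf(A -> B) = sup(A | B) / sup(A); assumes sup(A) is non-zero."""
    return Fraction(sup(set(antecedent) | set(consequent)), sup(antecedent))


def lift(antecedent, consequent):
    """lift(A -> B) = conf(A -> B) / rsup(B); assumes sup(B) is non-zero."""
    return conf(antecedent, consequent) / rsup(consequent)


# Two hand-written rules (antecedent, consequent), ranked by lift, then confidence.
rules = [({"f1"}, {"f2"}), ({"f2"}, {"f1"})]
ranked = sorted(rules, key=lambda r: (lift(*r), conf(*r)), reverse=True)
for antecedent, consequent in ranked:
    print(sorted(antecedent), "->", sorted(consequent),
          conf(antecedent, consequent), lift(antecedent, consequent))
```
]]></preformat>
          <p>Both rules have the same lift, since lift is symmetric in antecedent and consequent, so confidence decides their order.</p>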
          <p>Next we apply these rules to the (verb, frame) pairs given by the LADL
tables. For each rule, we then compute its applicability as follows. Let V<sub>ladl</sub> be
the number of verbs occurring in the LADL lexicon and V<sup>r</sup><sub>ladl</sub> be the number of
verbs in the LADL lexicon for which the rule r applies. Then the applicability
of a rule r is the ratio V<sup>r</sup><sub>ladl</sub> / V<sub>ladl</sub> between these two values.</p>
          <p><sup>8</sup> We used the Coron system (http://coron.loria.fr/site/index.php) for computing
the rules and the various metrics.</p>
        </sec>
        <sec id="sec-1-2-2">
          <title>Distribution of rules against confidence Distribution of rules against lift</title>
          <p>[Figure 4: (a) rules distribution against confidence (x-axis: confidence, binned from below 75% up to 100%);
(b) rules distribution against lift (x-axis: lift, binned from at most 1 up to at most 4000). Y-axis in both panels: number of rules.]</p>
          <p>We also evaluate the usefulness of a rule, i.e., its potential for discovering new
frames. Let F<sup>r</sup><sub>ladl</sub> be the number of frames present in the LADL lexicon for the verbs to
which rule r applies and let NewF<sup>r</sup><sub>ladl</sub> be the number of frames inferred by the
application of rule r and not present in the LADL lexicon. The usefulness
of a rule r is then defined as the ratio between the number of discovered frames and
the number of frames contained in the verb entries to which the rule applies:
            <disp-formula><tex-math>\mathrm{usefulness}(r) = \frac{\mathit{NewF}^{r}_{ladl}}{F^{r}_{ladl}}</tex-math></disp-formula>
          </p>
          <p>Figure 5 plots both the rule applicability and the rule usefulness against the
number of rules for the best 30 rules according to the applicability criterion (i.e.,
picking the 30 rules with highest applicability). Although most rules apply to less
than 5% of the LADL items, the usefulness score mostly ranges between 10 and
40%. Overall, applying these 30 best rules to the (verb, frame) pairs contained in
the LADL tables permits inferring 1435 (verb, frame) pairs. The confidence for
these rules ranges from 0.762 to 1, with most rules having a confidence close to
1. Their lift ranges from 1.174 to 6.33, and their support from 2 to 586. That is,
rules with high applicability are also reliable in that they display good confidence
and a lift score above 1. By comparison, when applying the 30 rules with the highest
support values, lift and confidence, in that ranking order, we obtain an increase
of 1157 verbs. In sum, to maximise both the number of frames inferred and their
reliability, a good strategy is to rank rules either by support or by applicability, and
then take the n best rules with respect to the chosen ranking.</p>
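          <p>Applicability and usefulness can be sketched as below. The fragment makes two assumptions that the text leaves implicit: a rule A → B is taken to apply to a verb when every frame of A is already recorded for that verb, and frames are counted per (verb, frame) pair. The lexicon, the rule and the function names are hypothetical.</p>
          <preformat><![CDATA[
```python
from fractions import Fraction

# Hypothetical LADL-style lexicon: verb -> frames already recorded (illustrative only).
LADL = {
    "w1": {"f2"},
    "w2": {"f1", "f2"},
    "w3": {"f3"},
    "w4": {"f2", "f3"},
}


def applies(rule, frames):
    """Assumption: a rule A -> B applies to a verb when all frames of A are recorded."""
    antecedent, _ = rule
    return antecedent <= frames


def applicability(rule, lexicon):
    """Share of the verbs in `lexicon` to which `rule` applies."""
    hits = sum(1 for frames in lexicon.values() if applies(rule, frames))
    return Fraction(hits, len(lexicon))


def usefulness(rule, lexicon):
    """Newly inferred (verb, frame) pairs divided by the pairs already recorded
    for the verbs the rule applies to (0 if the rule applies to no verb)."""
    _, consequent = rule
    known = 0
    new = 0
    for frames in lexicon.values():
        if applies(rule, frames):
            known += len(frames)
            new += len(consequent - frames)
    return Fraction(new, known) if known else Fraction(0)


rule = (frozenset({"f2"}), frozenset({"f1"}))
print(applicability(rule, LADL), usefulness(rule, LADL))
```
]]></preformat>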
        </sec>
        <sec id="sec-1-2-3">
          <title>Distribution of Rules against Applicability for the 30 best rules Distribution of Rules against Usefulness for the 30 best rules</title>
          <p>[Figure 5: distribution of rules against applicability and against usefulness for the 30 best rules. Y-axis in both panels: number of rules.]</p>
          <p>Much work on acquiring verb information for NLP has focused on identifying
so-called alternations, i.e., pairs of valency frames that are often simultaneously
true of a verb, and classes that associate sets of verbs with syntactic and/or
semantic information. The results presented in this paper suggest that FCA is
an appropriate framework for modelling such knowledge acquisition processes.</p>
          <p>Concepts naturally model the association of verbs and syntactic and/or
semantic information. Moreover, like fuzzy clustering, FCA permits "soft
clustering" in that a data element may belong to several classes, a property of
the produced classifications which is essential for our task since verbs (e.g., to
fly) are highly ambiguous and may belong to several syntactic and/or semantic
classes. Sections 2 and 3 show that stable concepts permit creating classes with
good generalisation and factorisation power (e.g., a few hundred syntactic classes
to cover roughly 3 500 verbs) and linguistically sound, empirical content (good
average number of verbs and frames within the classes).</p>
          <p>Association rules, on the other hand, are a natural way to capture alternations,
while the various evaluation metrics proposed in the literature permit ranking
them according to such criteria as reliability (confidence), strength of association
(lift) and breadth of application (support). Section 4 illustrates this by showing
how association rules can be used to extend an incomplete lexicon with additional
valency information.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Acknowledgments</title>
      <p>The research reported in this paper was partially supported by the French
National Research Agency (ANR) in the context of the Passage project
(ANR-06-MDCA-013). We would like to thank Yannick Toussaint for his suggestion to
use stability as a filter as well as for general feedback and help on the topics
addressed in this paper.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Baker</surname>
          </string-name>
          et al.,
          <year>1998</year>
          . Baker,
          <string-name>
            <given-names>C. F.</given-names>
            ,
            <surname>Fillmore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            , and
            <surname>Lowe</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. B.</surname>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>The Berkeley FrameNet project</article-title>
          .
          <source>In Proceedings of the 17th International Conference on Computational Linguistics</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>86</fpage>
          –
          <fpage>90</fpage>
          , Montreal, Quebec, Canada. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Briscoe</surname>
          </string-name>
          and Carroll,
          <year>1993</year>
          . Briscoe,
          <string-name>
            <given-names>T.</given-names>
            and
            <surname>Carroll</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          (
          <year>1993</year>
          ).
          <article-title>Generalized probabilistic LR parsing of natural language (corpora) with unification-based grammars</article-title>
          .
          <source>Comput. Linguist.</source>
          ,
          <volume>19</volume>
          (
          <issue>1</issue>
          ):
          <volume>25</volume>
          –
          <fpage>59</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Carroll</surname>
          </string-name>
          and Fang,
          <year>2004</year>
          . Carroll,
          <string-name>
            <given-names>J.</given-names>
            and
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. C.</surname>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>The automatic acquisition of verb subcategorisations and their impact on the performance of an HPSG parser</article-title>
          .
          <source>In IJCNLP</source>
          , pages
          <fpage>646</fpage>
          –
          <lpage>654</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Cimiano</surname>
          </string-name>
          et al.,
          <year>2005</year>
          . Cimiano,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Hotho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            , and
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Learning concept hierarchies from text corpora using formal concept analysis</article-title>
          .
          <source>Journal of Artificial Intelligence Research (JAIR)</source>
          ,
          <volume>24</volume>
          :
          <fpage>305</fpage>
          –
          <lpage>339</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Fellbaum</surname>
          </string-name>
          ,
          <year>1998</year>
          . Fellbaum,
          <string-name>
            <surname>C.</surname>
          </string-name>
          , editor (
          <year>1998</year>
          ).
          <article-title>WordNet: An Electronic Lexical Database</article-title>
          . MIT Press, Cambridge, MA.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Gross</surname>
          </string-name>
          ,
          <year>1975</year>
          . Gross,
          <string-name>
            <surname>M.</surname>
          </string-name>
          (
          <year>1975</year>
          ).
          <article-title>Méthodes en syntaxe</article-title>
          . Hermann, Paris.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Jay</surname>
          </string-name>
          et al.,
          <year>2008</year>
          . Jay,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Kohler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            , and
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Analysis of social communities with iceberg and stability-based concept lattices</article-title>
          .
          <source>In ICFCA'08: Proceedings of the 6th international conference on Formal concept analysis</source>
          , pages
          <fpage>258</fpage>
          –
          <lpage>272</lpage>
          , Berlin, Heidelberg. Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Klimushkin</surname>
          </string-name>
          et al.,
          <year>2010</year>
          . Klimushkin,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Obiedkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            , and
            <surname>Roth</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Approaches to the selection of relevant concepts in the case of noisy data</article-title>
          . In Kwuida, L. and
          <string-name>
            <surname>Sertkaya</surname>
          </string-name>
          , B., editors,
          <source>Formal Concept Analysis</source>
          , volume
          <volume>5986</volume>
          of Lecture Notes in Computer Science, chapter
          <volume>18</volume>
          , pages
          <fpage>255</fpage>
          –
          <lpage>266</lpage>
          . Springer Berlin / Heidelberg, Berlin, Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <year>2007</year>
          . Kuznetsov,
          <string-name>
            <surname>S. O.</surname>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>On stability of a formal concept</article-title>
          .
          <source>Annals of Mathematics and Artificial Intelligence</source>
          ,
          <volume>49</volume>
          (
          <issue>1-4</issue>
          ):
          <fpage>101</fpage>
          –
          <lpage>115</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Levin</surname>
          </string-name>
          ,
          <year>1993</year>
          . Levin,
          <string-name>
            <surname>B.</surname>
          </string-name>
          (
          <year>1993</year>
          ).
          <article-title>English Verb Classes and Alternations: a preliminary investigation</article-title>
          . University of Chicago Press, Chicago and London.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Saint-Dizier</surname>
          </string-name>
          ,
          <year>1999</year>
          . Saint-Dizier,
          <string-name>
            <surname>P.</surname>
          </string-name>
          (
          <year>1999</year>
          ).
          <article-title>Alternation and verb semantic classes for French: Analysis and class formation</article-title>
          . In
          <source>Predicative forms in natural language and in lexical knowledge bases</source>
          . Kluwer Academic Publishers.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Schuler</surname>
          </string-name>
          ,
          <year>2006</year>
          . Schuler,
          <string-name>
            <surname>K. K.</surname>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon</article-title>
          .
          <source>PhD thesis</source>
          , University of Pennsylvania.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Szathmary</surname>
          </string-name>
          ,
          <year>2006</year>
          . Szathmary,
          <string-name>
            <surname>L.</surname>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Symbolic Data Mining Methods with the Coron Platform</article-title>
          .
          <source>PhD Thesis</source>
          in Computer Science, University Henri Poincaré – Nancy 1, France.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <source>van den Eynde and Mertens</source>
          ,
          <year>2003</year>
          . van den Eynde, K. and
          <string-name>
            <surname>Mertens</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>La valence: l'approche pronominale et son application au lexique verbal</article-title>
          .
          <source>Journal of French Language Studies</source>
          ,
          <volume>13</volume>
          :
          <fpage>63</fpage>
          –
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>