=Paper= {{Paper |id=Vol-2006/paper079 |storemode=property |title=Developing a Large Scale FrameNet for Italian: the IFrameNet Experience |pdfUrl=https://ceur-ws.org/Vol-2006/paper079.pdf |volume=Vol-2006 |authors=Silvia Brambilla,Danilo Croce,Fabio Tamburini,Roberto Basili |dblpUrl=https://dblp.org/rec/conf/clic-it/BrambillaCT017 }} ==Developing a Large Scale FrameNet for Italian: the IFrameNet Experience== https://ceur-ws.org/Vol-2006/paper079.pdf
                        Developing a large scale FrameNet for Italian:
                                 the IFrameNet experience

     Roberto Basili°              Silvia Brambilla§             Danilo Croce°          Fabio Tamburini§
      °Dept. of Enterprise Engineering                    §Dept. of Classic Philology and Italian Studies
      University of Rome Tor Vergata                                  University of Bologna
    {basili,croce}@info.uniroma2.it                             fabio.tamburini@unibo.it,
                                                            silvia.brambilla@studio.unibo.it

                                                             ian Portuguese, German, Spanish, Japanese, Swe-
                      Abstract                               dish and Korean.
                                                                 All these projects are based on the idea that
     English. This paper presents work in pro-               most of the Frames are the same among languages
     gress for the development of IFrameNet, a               and that, thanks to this, it is possible to adopt
     large-scale, computationally oriented, lexi-            Berkeley’s Frames and FEs and their relations,
     cal resource based on Fillmore’s frame se-              with few changes, once all the language-specific
     mantics for Italian. For the development of             information has been cut away (Tonelli et al. 2009,
     IFrameNet, linguistic analysis, corpus-                 Tonelli 2010).
     processing and machine learning techniques                  With regard to Italian, over the past ten years
     are combined in order to support the semi-              several research projects have been carried out at
     automatic development and annotation of                 different universities and Research Centres. In par-
     the resource.                                           ticular, the ILC-CNR in Pisa (e.g. Lenci et al. 2008;
                                                             Johnson and Lenci 2011), FBK in Trento (e.g.
     Italiano. Questo articolo presenta un work              Tonelli et al. 2009, Tonelli 2010) and the Universi-
     in progress per lo sviluppo di IFrameNet,               ty of Rome, Tor Vergata (e.g. Pennacchiotti et al.
     una risorsa lessicale ad ampia copertura,               2008, Basili et al. 2009) proposed automatic or
     computazionalmente orientata, basata sulle              semiautomatic methods to develop an Italian
     teorie di Semantica dei Frame proposte da               FrameNet. However, as of today, a resource even
     Fillmore. Per lo sviluppo di IFrameNet so-              remotely equivalent to Berkeley’s FrameNet (BFN)
     no combinate analisi linguistica, corpus-               is still missing.
     processing e tecniche di machine learning al                As a lexical resource of this kind is useful in
     fine di semi-automatizzare lo sviluppo della            many computational applications (such as Human-
     risorsa e il processo di annotazione.                   Robot interaction), a new effort is currently being
                                                             jointly made at the universities of Bologna and
1     Introduction                                           Roma, Tor Vergata. The IFrameNet project aims to
                                                             develop a large-coverage FrameNet-like resource
First developed at the University of California,             for Italian, relying on robust and scalable methods,
Berkeley, in 1997, FrameNet applies theories                 in which the automatic corpus processing is con-
from Frame Semantics (Fillmore 1976, 1982,                   sistently integrated with manual lexical analysis. It
1985) to NLP and explains words’ meanings ac-                builds upon the achievements of previous projects
cording to the semantic frames they evoke. It illus-         that automatically harvested FrameNet LUs ex-
trates semantic frames (i.e. schematizations of pro-         ploiting both distributional and WordNet based
totypical events, relations or entities in reality),         models (Pennacchiotti et al. 2008). Since the LUs
through the involved participants (called frame el-          induction is a noisy process, the data thus obtained
ements, FEs) and the evoking words (or, better, the          need to be manually refined and validated.
lexical units, LUs). Moreover, FrameNet aims to                  The aim is also to provide Sample Sentences for
give a valence representation of the lexical units           LUs with the highest corpus frequency. On the one
and underline the relations between frames and               side, they will be derived from already existing
between frame elements (Baker et al. 1998).                  resources such as the HuRIC corpus (Bastianelli
    The initial American project has since been ex-          2014) or the EvalIta2011 FLaIT task data: FBK set
tended to other languages: French, Chinese, Brazil-          (Tonelli, Pianta 2008) and ILC set (Lenci et al.
                                                             2012). On the other side, candidate sentences will
also be extracted through semi-automatic distribu-       tional and paradigmatic lexical information (i.e.
tional analysis of a large corpus - i.e. CORIS (Ros-     derived from WordNet) to assign unknown LUs to
sini Favretti et al. 2002) - and refined through lin-    frames. In particular, distributional models are used
guistic analysis and manual validation of data thus      to select a list of frames suggested by the corpus’
obtained.                                                evidence and then the plausible lexical senses of
                                                         the unknown LU are used to re-rank proposed
                                                         frames.
2     The development of the large scale                     In order to rely on comparable representations
      IFrameNet resource                                 for LUs and sentences for transferring semantic
                                                         information from the former to the latter, we ex-
The need for a large-scale resource cannot be satis-     ploit Distributional Models (DM) of Lexical Se-
fied without resorting to a semi-automatic process       mantics, in line with (Pennacchiotti et al. 2008) and
for the gathering of linguistic evidence, selection of   (De Cao et al. 2008). DMs are intended to acquire
lexical examples as well as the annotation of the        semantic relationships between words, mainly by
targeted texts. This work is thus at the crossroads      looking at the word usage. The foundation for these
of linguistic theoretical investigation, corpus analy-   models is the Distributional Hypothesis (Harris
sis and natural language processing.                     1954), i.e. words that are used and occur in the
    On the one hand, the matching between LUs            same “contexts” tend to be semantically similar. A
and frames is always granted through manual lin-         context is a set of words appearing in the neighbor-
guistic validation applied to the data in the devel-     hood of a target predicate word (e.g. a LU). In this
opment stage. For every Frame the correctness of         sense, if two predicates share many contexts then
the induced LUs is analysed and the ‘missing’            they can be considered similar in some way. Alt-
LUs, that is the BFN LUs’ translations, which are        hough different ways for modeling word semantics
absent in the induced LU list, are detected.             exist (Sahlgren 2006; Pado and Lapata 2007;
    On the other hand, most choices rely on large        Mikolov et al. 2013; Pennington et al. 2014), they
sets of corpus examples, as made available by CO-        all derive vector representations for words from
RIS. Finally, the scaling to large sets of textual ex-   more or less complex processing stages of large-
amples is supported by automatically searching           scale text collections. This kind of approach is ad-
candidate items through semantic pre-filtering over      vantageous in that it enables the estimation of se-
the corpus: frame phenomena are here used as que-        mantic relationships in terms of vector similarity.
ries while intelligent retrieval and ranking methods     From a linguistic perspective, such vectors allow
are applied to the corpus material to minimize the       for some aspects of lexical semantics to be geo-
manual effort involved.                                  metrically modelled, and to provide a useful way
    In the following section, we will sketch the         to represent this information in a machine-readable
main stages of the process that integrate the above      format. Distributional methods can model different
paradigms.                                               semantic relationships, e.g. topical similarities (if
                                                         vectors are built considering the occurrence of a
2.1    Integrating corpus processing and lexical         word in documents) or paradigmatic similarities (if
       analysis for populating IFrameNet                 vectors are built considering the occurrence of a
                                                         word in the (short) contexts of another word
The beneficial contribution of the interaction be-       (Sahlgren 2006)). In such models, words like run
tween corpus processing techniques and lexical           and walk are close in the space, while run and read
analysis for the semi-automatic expansion of the         are likely to be projected in different subspaces.
FrameNet resource has been discussed since (Pen-         Here, we concentrate on DMs mainly devoted to
nacchiotti et al. 2008), where LU induction is pre-      modelling paradigmatic relationships, as we are
sented as the task of assigning a generic lexical unit   more interested in capturing phenomena of quasi
not yet present in the FrameNet database (the so-        synonymy, i.e. semantic similarity that tends to
called unknown LU) to the correct frame(s). The          preserve meaning.
number of possible classes (i.e. frames) and the
problem of multiple assignment make it a challeng-       2.2   The development cycle
ing task. This task is discussed in (Pennacchiotti et    In the following paragraphs, we outline the different
al. 2008, De Cao et al. 2008, Croce and Previtali        stages in the development process. Each stage cor-
2010), where different models combine distribu-          responds to specific computational processes.
           Figure 1: Three lexical clusters for the frames triggered by the verb abandon.v: pairs close in the
                              map correspond to (paradigmatically) semantically similar words and frames
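
As a rough illustration of how cluster centroids such as those in Fig. 1 could be used, the following Python sketch assigns an unseen word vector to the frame whose nearest cluster centroid is most similar under cosine similarity. The toy three-dimensional vectors, the cluster memberships and the helper names (`centroid`, `cosine`, `nearest_frame`) are hypothetical and are not taken from the IFrameNet data; real vectors would come from a distributional model trained on a corpus.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_frame(vector, clusters):
    """Return (frame, similarity) for the most similar cluster centroid.

    clusters maps a frame name to a list of clusters; each cluster is a
    list of LU vectors, so one frame may own several centroids.
    """
    best_frame, best_sim = None, float("-inf")
    for frame, frame_clusters in clusters.items():
        for members in frame_clusters:
            sim = cosine(vector, centroid(members))
            if sim > best_sim:
                best_frame, best_sim = frame, sim
    return best_frame, best_sim

# Hypothetical toy data: two clusters for DEPARTING, one for COLLABORATION.
clusters = {
    "DEPARTING": [[[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]], [[0.7, 0.0, 0.3]]],
    "COLLABORATION": [[[0.0, 0.9, 0.2], [0.1, 0.8, 0.1]]],
}
unseen = [0.85, 0.15, 0.05]
frame, similarity = nearest_frame(unseen, clusters)
print(frame)
```

Keeping several centroids per frame, rather than a single mean over all of its LUs, reflects the idea that the known LUs of one frame may occupy more than one region of the space.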

Validation of existing resources. At this stage,                  Lexical clustering is important here as specific
the existing resources, dating back to previous                space regions enclosing the instance vectors of
work, are analysed and manually pruned of errors               some considered LUs correspond to semantically
such as lexical units wrongly assigned to frames               coherent lexical subsets. This is a priming function
(e.g. ‘asta’ or ‘colmo’ to the Frame                           for mapping unseen word vectors to frames, as ap-
‘BODY_PARTS’), or words never assigned to their                plied in (De Cao et al. 2008): the centroids of the
correct frame, for instance the LU ‘piede’ or                  possibly multiple clusters generated by the known
‘mano’ for the Frame ‘BODY_PARTS’.                             LUs of a given frame f are used to detect all regions
                                                               expressing f and thus predict the predicate f over
    All the acquired Italian LUs have been com-
                                                               previously unseen words and sentences. Examples
pared, frame by frame, to BFN’s ones, using bilin-
                                                               of semantically coherent regions evoked by the
gual dictionaries (e.g. Oxford bilingual dictionary)
                                                               verb abandon for the English FrameNet are report-
and WordNet in order to verify the correctness of
                                                               ed in Fig. 1. Here different lexical clusters for a
the matching between lexical units and frames. Over the
                                                               given frame (i.e. DEPARTING) are depicted while
15,134 automatically acquired ⟨LU, frame⟩ pairs
                                                               different frames (e.g. DEPARTING, QUIT-
(6,670 nouns and 8,464 verbs and adjectives),
                                                               TING_A_PLACE, COLLABORATION) are also evoked
7,377 LUs have been considered correctly assigned
                                                               by the verb. It should be noted that in the figure
(2,506 verb and adjective and 4,871 noun pairs).
                                                               distances in the two-dimensional plot correspond to
    In addition, bilingual dictionaries, ItalWordNet
                                                               distances between the word embedding vectors,
and MultiWordNet have been used to manually
                                                               while each lexical cluster is expressed as the cen-
insert a list of missing lexical entries for each
                                                               troid of its member vectors.
frame. At the end of the process, the resulting vali-
                                                                  The distributional information has been acquired
dated and refined ⟨LU, frame⟩ pairs amount to 7,902
                                                               for the considered 7,902 LUs from CORIS and
(5,128 nouns and 2,774 verbs and adjectives).
                                                               used to support the LU mapping and the sentence
   Corpus processing and lexical modeling. At                  validation. In fact, given a sentence s containing a
this stage, the LUs made available from manual                 target LU l, a specific geometrical representation
validation are used to model distributionally the              for s can be derived by linearly combining all vec-
individual frames. Firstly, distributional corpus              tors representing words w surrounding l in sentence
analysis is applied to map individual LUs into dis-            s. This duality property allows the embedding
tributional vectors. A distributional model will be            space to represent sentences s, lexical units l as well
acquired from the CORIS corpus by applying the                 as generic words w. This makes it possible to model the rele-
neural method presented in (Mikolov et al. 2013).              vance of a frame f for an incoming sentence s
It will enable the acquisition of geometrical repre-           through the distance d(f,s) between vectors f related
sentations for words in a high dimensional space               to a centroid for a frame f and the vector s of the
where distance reflects the paradigmatic relation              sentence s. It corresponds to a confidence measure
among words. This model can also be adopted to                 computed for a rule such as:
build a representation for sentences, as traditional-
                                                                   “s is a valid example of the usage of frame f ”
ly carried out by Distributional Semantic models,
e.g. (Landauer and Dumais 1997) or (Mitchell and                The open aspects of the above semi-automatic
Lapata, 2010).                                                  process are the following:
   I. How to design a suitable representation           3     Status of the Project and Perspective
      (centroid or model) for a frame f                       Views
  II. How to define the vector for a sentence s
 III. How to compute the distance function d(f,s)       Although the general software architecture for the
                                                        project progress is available, the overall process
  The current research activity is focusing on the      described above has not been fully accomplished.
best solution for these issues and part of our exper-       Current material covers a set of 554 frames and
imental activity is devoted to assess these design      7,902 lexical units, of which 2,604 verbs, 5,128
choices, as discussed in Section 3.                     nouns and 170 adjectives. The average number of
                                                        occurrences for each of these selected words is
First Lexical Analysis and Validation. A further        higher than 9,400, although there are still 508
stage for the resource development focuses on the       words not present in CORIS.
selection of a significant sample of LUs, chosen on         All these occurrences correspond to a number
the basis of their high semantic salience and for       of about 70 million non-validated and unsorted
their high number of occurrences in the corpus          sentences. In the rest of the paper, we describe the
(primary LUs). By relying on the method described       outcome of the First Lexical Analysis and Valida-
above, we use the distributional representation of      tion stage: its aim is to trigger the semi-automatic
words, lexical units and sentences, to gather CO-       learning and tagging of the whole corpus, accord-
RIS sentences s where a LU occurs and evaluate its      ing to the methods suggested in section 2.2.
suitability as an example for the evoked f. This de-
cision function is based on the geometric distance      3.1    Empirical Investigation: First Lexical
d(f,s) that can be computed over a large number of             Analysis and Validation
sentences s. When this step is carried out in CO-
                                                        The First Lexical Analysis and Validation stage has
RIS, the validation of the acquired candidate sen-
                                                        now been completed. It addresses the three research
tences allows positive examples of a frame f
                                                        questions posed above: (I) the modelling of a frame
to be developed quickly: this is used to trigger super-
                                                        f, (II) the sentence representation and (III) the defi-
vised learning of f.
                                                        nition of a distance function able to model similari-
    The manual validation in fact confirms the
                                                        ty between sentences.
proper correspondence between automatically se-
                                                             About the problem (I) two approaches are pos-
lected sentences and LUs that evoke a targeted
                                                        sible. We can model a frame via clustering its lexi-
frame f. It produces novel seed examples for f: the-
                                                        cal units and applying the method described in
se will serve as a training set for a semi-automatic
                                                        (Pennacchiotti et al. 2008, De Cao et al. 2008).
stage of resource expansion.
                                                        Alternatively, we can adopt a supervised technique.
                                                        A frame f is represented as the target class of in-
Semi-automatic resource expansion. The ac-
                                                        stances corresponding to ⟨s,l⟩ pairs, where s is an
quired distributional model will support the semi-
                                                        input sentence and l is a lexical unit: a statistical
automatic expansion of the seed set, by selecting
                                                        classifier is trained to map ⟨s,l⟩ into a confidence
the most semantically similar words to the seed set
and assigning them to frames by applying the            value and its output h(s,l,f) corresponds to the sys-
methodologies suggested in (Pennacchiotti et al.        tem’s confidence that the sentence
2008, De Cao et al. 2008, Croce and Previtali                       “f is the frame evoked by l in s”
2010). Moreover, the same distributional model          is true. Notice that the pair ⟨s,l⟩ can be expressed
will support the assignment process of sentences to     as an instance by combining the embedding vector
frames. We will in fact investigate semi-supervised     l of its lexical unit l with a vector s for s.
models based on clustering techniques (Pennac-               As a solution for the problem (II) we define s as
chiotti et al. 2008) or other supervised approaches     the linear combination of vectors w, for each word
such as Support Vector Machines as in (Croce and
                                                        w in s, i.e. $\mathbf{s} = \sum_{w \in s} \mathbf{w}$.
Previtali 2010).
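
The sentence representation defined for problem (II) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two-dimensional toy embeddings are invented, words without an embedding are simply skipped, and the combination of the LU vector with the sentence vector is read here as a concatenation (which yields a 2n-dimensional instance); the function names `sentence_vector` and `pair_instance` are hypothetical.

```python
def sentence_vector(tokens, embeddings):
    """Sum the embedding vectors of the tokens of a sentence.

    Assumption: tokens without an embedding are ignored.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for token in tokens:
        vec = embeddings.get(token)
        if vec is None:
            continue
        total = [t + x for t, x in zip(total, vec)]
    return total

def pair_instance(lexical_unit, tokens, embeddings):
    """Build the instance for a (sentence, LU) pair.

    Assumption: the LU vector and the sentence vector are concatenated,
    giving a vector of dimension 2n.
    """
    return list(embeddings[lexical_unit]) + sentence_vector(tokens, embeddings)

# Hypothetical toy embeddings (n = 2) for a lemmatized sentence.
embeddings = {
    "finire": [1.0, 0.0],
    "vicenda": [0.0, 2.0],
    "lì": [0.5, 0.5],
}
tokens = ["la", "vicenda", "finire", "lì"]

s = sentence_vector(tokens, embeddings)
x = pair_instance("finire", tokens, embeddings)
print(s, x)
```

With a singleton sentence containing only the LU itself, the same function returns the LU vector twice in a row, which is one way to picture an instance built from a lexical unit alone.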


                                                            The above formulation allows us to define the
Final Validation and Release. The extracted sen-        classification task as follows:
tences will be ordered by decreasing probability,           Given a sentence s including a word l as a po-
according to their distributional collocation, and a    tential frame evoking LU, Find the frame f that
list of 15 to 20 candidates per LU will be provided.    characterizes l in s.
This list will be manually validated. The aim is to         The solution of the above problem over a ⟨s,l⟩
provide at least 4 sample sentences for each of the     pair would also be a useful solution for the problem
primary LUs.                                            (III), as the confidence h(s,l,f) in the classification
of a sentence s in a frame f for l can be retained as       uation over a set of 326¹ frames, the ones with
the inverse of the target distance function d(f,l) lo-      more than 5 lexical units in the initial lexicon. In
cal to the sentence.                                        this way, we selected 1,095 different LUs, repre-
    The major problem with the above formulation            sented as an embedding vector in the wordspace.
is that the training of the statistical classifier is not   On average, we have 12 LUs per frame, and every
possible without the availability of useful examples        individual lexical entry l appears in about 1.88
for the different frames f. The idea is thus to develop     frames. The baseline of a classification task that
ways to derive from CORIS the proper candidates s           maps a sentence s including a lexical unit into its
for f through the knowledge of some of its LUs. In          own frame is about 35%, owing to the ambiguity char-
the bootstrapping stage, we define as virtual exam-         acterizing the most frequent entries.
ples the pairs ⟨l,{l}⟩ that are retained as positive        We asked three annotators to evaluate individual
examples for the frame f, for every l that is a known       triples ⟨l, s, f⟩ validating the system proposal. Four
lexical unit for f. In our approach, an example is          main cases were possible:
thus obtained by modelling the sentence s as a sin-          • MISSING FRAME. The sentence s is not mani-
gleton {l}, i.e. the lexical unit l.                            festing any of the frames f evoked by the lexi-
    A statistical classifier considers every known              cal unit l, but corresponds to a frame not yet
LU as an individual (positive) example and can be               present in the lexicon for l. In this case the algo-
applied to every LU in our initial resource (i.e.               rithm cannot provide the suitable frame, as it
7,902 for the 554 frames).                                      cannot generate a novel frame.
    In summary, the method works as follows. First,          • NOT APPLICABLE. The sentence s does not con-
for every lemma w in the corpus, an n-dimensional               tain an occurrence of the lexical unit l in one of
embedding vector w is derived, according to                     its proper senses: this case is typical for phrase-
(Mikolov et al. 2013). As a side effect, for every              ological uses of a verb such as morire di freddo,
LU l of each known frame f, the lexical embedding               andare di fretta, … that do not directly corre-
vector l is used to build the example (l, l) for the            spond to lexical predicates and thus cannot be
LU sentence pair: ⟨l, {l}⟩.                                     treated through the lexical embedding vectors.
    A multiclass-statistical categorizer is trained for      • CORRECT/INCORRECT, when the outcome
every frame f for which at least 5 examples (i.e. 5             $\arg\max_{f'} h(l,s,f')$ is correct (or incorrect) as
different LUs) were available.                                  the frame evoked by l in s is exactly (or not) f.
    When applied to an incoming sentence s includ-          According to the above method annotators validat-
ing a LU l, the classifier outcome h(l,s, f) is said to     ed 667 sentences for 113 frames and 212 different
accept the frame f if:                                      verbal lexical units. The analysis yielded a
   • f belongs to the set of frames evoked by l             precision (i.e. the number of correct candidate
   • f = $\arg\max_{f'} h(l,s,f')$                          frames emitted by the algorithm w.r.t. the number
                                                            of valid cases, that is all but the MISSING FRAME or
For every sentence s including a frame-evoking              NOT APPLICABLE cases) of 75.2%, well beyond the
lexical unit l, the above function suggests one can-        35% baseline. The method could be applied to
didate frame among the possibly multiple ones.              74.5% of the sentences, including CORRECT
When the scoring function h is negative every-              cases and MISSING FRAME cases. We neglected in
where (e.g. with the SVM formulation of a classifi-         this coverage score the NOT APPLICABLE cases that
cation task), the sentence is rejected and is not con-      amount to 44 sentences, i.e. about 6,4%.
sidered a valid example for future iterations.
                                                            Examples of the correct assignment of the algo-
    The application of this method to the CORIS             rithm on quite ambiguous verbs, such as finire (i.e.
corpus has been carried out applying a multi-               to    end,     in    frames    ACTIVITY_FINISH,
classifier SVM with linear kernel to the 2n-                CAUSE_TO_END and KILLING) or rivelare (i.e. to
dimensional vectors of each pair ⟨l, {l}⟩. Starting         reveal, in frames REVEAL_SECRET, OMEN, EVI-
from the lexicon validated in the first stages, the         DENCE) are the following:
SVM has been able to label over 2 million sentenc-
                                                            La vicenda avrebbe potuto [finire]ACTIVITY_FINISH lì , ma il prefetto
es.
                                                            di Nuoro fece presentare ...
3.2    Empirical Investigation: Current Results             In prova si è [rivelato]EVIDENCE ad altissimo livello sia sull'
                                                            asciutto sia sul ...
In order to evaluate the proposed supervised classi-
fication method for the stage “First Lexical Analy-
sis and Validation” we ran an experimental eval-            1
                                                             By keeping the frames that include at least 4 lexical units the
                                                            number of targeted frames grows to 371.
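
The acceptance rule of Section 3.1 and the precision measure used in this evaluation can be sketched as follows. This is a minimal illustration, not the system's code: the scores are invented, the argmax is assumed to be restricted to the frames evoked by the lexical unit, "negative everywhere" is read as all candidate scores being below zero, and the names `accept_frame` and `precision` are hypothetical.

```python
def accept_frame(scores, evoked_frames):
    """Pick the frame suggested for a (lexical unit, sentence) pair.

    scores: dict mapping a frame name to the classifier score h(l, s, f).
    evoked_frames: frames listed in the lexicon for the lexical unit l.
    Returns None when the sentence is rejected.
    """
    # Assumption: only frames evoked by l compete in the argmax.
    candidates = {f: scores[f] for f in evoked_frames if f in scores}
    if not candidates or all(v < 0 for v in candidates.values()):
        return None
    return max(candidates, key=candidates.get)

def precision(judgements):
    """Share of CORRECT judgements among the valid cases.

    MISSING FRAME and NOT APPLICABLE judgements are not valid cases
    and are therefore left out of the denominator.
    """
    correct = judgements.count("CORRECT")
    valid = correct + judgements.count("INCORRECT")
    return correct / valid if valid else 0.0

# Hypothetical scores for a sentence containing the verb finire.
scores = {"ACTIVITY_FINISH": 0.7, "KILLING": -0.2, "OMEN": 0.9}
evoked = ["ACTIVITY_FINISH", "CAUSE_TO_END", "KILLING"]
chosen = accept_frame(scores, evoked)

# Hypothetical annotator judgements for six proposals.
judgements = ["CORRECT", "CORRECT", "INCORRECT",
              "MISSING FRAME", "NOT APPLICABLE", "CORRECT"]
p = precision(judgements)
print(chosen, p)
```

In this toy run the high score for OMEN is ignored because that frame is not in the lexicon entry of the verb, and the two non-valid judgements do not lower the precision.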
An example of Missing Frame is BEAT_OPPONENT                        Johnson, M. and Lenci, A. (2011). Verbs of visual per-
for the verb battere in                                                ception in Italian FrameNet, Advances in Frame Se-
                                                                       mantics, 3(1), 9–45
... impegnato a fornire quante più informazioni possibili, anche
per [battere]BEAT_OPPONENT la concorrenza dei siti Ipsoa e il ...   Landauer, T. and Dumais, S. (1997). A solution to Pla-
                                                                      to’s problem: The latent semantic analysis theory of
as the lexicon of the verb battere only includes the                  acquisition, induction and representation of
frames CAUSE_HARM, CORPORAL_PUNISHMENT                                knowledge. Psychological Review, 104.
and EXPERIENCE_BODILY_HARM.                                         Lenci, A, Johnson, M, Lapesa, G. (2010). Building an
    The experiments, so far run only over verbal lexical              Italian FrameNet through Semi-automatic Corpus
units, will soon be extended to nouns and adjec-                      Analysis. Proceedings of LREC 2010. Malta.
tives. However, the encouraging precision reached
                                                                    Lenci, A., Montemagni, S., Venturi, G, Cutrullà, M. G.
by the method allows its direct use in an iterative
                                                                      (2012). Enriching the ISST-TANL Corpus with
active learning schema, where the more ambiguous                      Semantic Frames in Proceedings of LREC 2012,
sentences found and annotated within a specific                       Istanbul, Turkey
training stage are used to train the system at the
                                                                    Mikolov, T., Kai Chen, Greg Corrado, and Jeffrey Dean.
next stage. We expect this to speed up the lexicon
                                                                      (2013). Efficient estimation of word representations
development process and to allow bootstrapping                        in      vector    space.    CoRR     abs/1301.3781.
with fewer resources. The lexicon will be made                        http://arxiv.org/abs/1301.3781.
available for crowdsourcing further annotations and
                                                                    Mitchell, J. and Lapata, M. (2010). Composition in dis-
delivered incrementally in the next few months.
                                                                      tributional models of semantics. Cognitive Science,
                                                                      34(8):1388–1429.
References
                                                                    Pado, S. and Lapata, M. (2007). Dependency-based con-
Baker C. F., Fillmore, C. J., Lowe, J. B.. (1998). The                struction of semantic space models. Computational
  Berkeley FrameNet project. In: COLING '98                           Linguistics, 33(2):161–199.
  Proceedings of COLING '98, 1. Canada, 86-90.
                                                                    Pennacchiotti M., De Cao D., Basili R., Croce D., Roth
Basili R., De Cao D., Croce D., Coppola B., Moschitti                 M. (2008). Automatic induction of FrameNet lexical
  A. (2009). Cross-Language Frame Semantics                           units. In: Proceedings of the EMNLP 2008, Hawaii
  Transfer in Parallel Corpora. In: Proceedings of the
  CICLing 2009, Best Paper Award. Mexico                            Pennington, J., Socher, R. and Manning, C. (2014).
                                                                      GloVe: Global Vectors for Word Representation, In
Bastianelli, E., Castellucci, G., Croce, D., Iocchi, L.,              Proceedings of EMNLP 2014, 1532-1543.
  Basili, R., & Nardi, D. (2014). HuRIC: a Human Ro-
  bot Interaction Corpus. In Proceeings of LREC 2014,               Rossini Favretti R., Tamburini F., De Santis C. (2002).
  4519-4526.                                                          CORIS/CODIS: A corpus of written Italian based on
                                                                      a defined and a dynamic model. In A Rainbow of
Croce, D. and Previtali, D. (2010). Manifold learning for             Corpora: Corpus Linguistics and the Languages of the
  the semi-supervised induction of framenet predicates:               World, Lincom-Europa, Munich, 27-38.
  an empirical investigation. In Proceedings of GEMS
  ’10, pages 7–16, Stroudsburg, PA, USA.                            Sahlgren, M.. (2006). The Word-Space Model. Ph.D.
                                                                      thesis, Stockholm University.
De Cao D., Croce D., Pennacchiotti M., Basili R. (2008).
  Combining word sense and usage for modeling frame                 Tonelli, S. and Pianta, E. (2008). Frame information
  semantics. In Proceedings of STEP 2008, Italy                       transfer from English to Italian. In Proceedings of
                                                                      LREC, Marrekech, Morocco
De Cao D., Croce D., Basili R. (2010). Extensive
  Evaluation of a FrameNet-WordNet mapping                          Tonelli, S, Pighin, D, Giuliano, C, Pianta, E. (2009).
  resource. In: Proceedings of the LREC 2010, Malta.                  Semi-automatic Development of FrameNet for Ital-
                                                                      ian. In Proceedings of the FrameNet Workshop and
Fillmore, C.J. (1985). Frames and the semantics of                    Masterclass, Milano, Italy. Milan, Italy
   understanding. Quaderni di Semantica, VI(2), 222-254.
                                                                    Tonelli, S. and Pighin, D. (2009). ‘New Features for
Fillmore, Charles J. (1976). Frame semantics and the                  FrameNet - WordNet Mapping’, in Proceedings of
   nature of language, Annals of the New York Acade-                  CoNLL 2009, Boulder, Colorado, 219–227
   my of Sciences: Conference on the Origin and Devel-
   opment of Language and Speech, vol. 280, pp. 20-32               Tonelli, S. (2010). “Semi-automatic techniques for ex-
                                                                      tending the FrameNet lexical database to new lan-
Fillmore, C. J. (1982). Frame semantics. Linguistics in               guages”, Università Ca’ Foscari, Venezia
   the morning calm, pp. 111-137.
                                                                    Venturi G., Lenci A., Montemagni S., Vecchi E., Sagri
Harris, Z. (1954). Distributional structure. In Jerrold J.            M., Tiscornia D., Agnoloni T. (2009). Towards a
  Katz and Jerry A. Fodor, editors, The Philosophy of                 FrameNet Resource for the Legal Domain. In
  Linguistics, New York. Oxford University Press.                     Proceedings of LOAIT 2009. Barcelona, Spain