<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Developing a large scale FrameNet for Italian: the IFrameNet experience</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Roberto Basili°</string-name>
          <email>basili@info.uniroma2.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Silvia Brambilla</string-name>
          <email>silvia.brambilla@studio.unibo.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Danilo Croce°</string-name>
          <email>croce@info.uniroma2.it</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Classic Philology and Italian Studies University of Bologna</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>English. This paper presents work in progress on the development of IFrameNet, a large-scale, computationally oriented lexical resource for Italian based on Fillmore's frame semantics. Linguistic analysis, corpus processing and machine learning techniques are combined to support the semi-automatic development and annotation of the resource. Italiano. Questo articolo presenta un work in progress per lo sviluppo di IFrameNet, una risorsa lessicale ad ampia copertura, computazionalmente orientata, basata sulle teorie di Semantica dei Frame proposte da Fillmore. Per lo sviluppo di IFrameNet sono combinate analisi linguistica, corpus processing e tecniche di machine learning al fine di semi-automatizzare lo sviluppo della risorsa e il processo di annotazione.</p>
      </abstract>
      <kwd-group>
        <kwd>Fabio Tamburini§</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        First developed at the University of California,
Berkeley, in 1997, FrameNet applies the theory of
Frame Semantics
        <xref ref-type="bibr" rid="ref7 ref8 ref9">(Fillmore 1976, 1982,
1985)</xref>
        to NLP and explains the meanings of words
according to the semantic frames they evoke. It
describes semantic frames (i.e. schematizations of
prototypical events, relations or entities in reality)
through the participants involved (called frame
elements, FEs) and the evoking words (more precisely, the
lexical units, LUs). Moreover, FrameNet aims to
give a valence representation of the lexical units
and to highlight the relations between frames and
between frame elements
        <xref ref-type="bibr" rid="ref1">(Baker et al. 1998)</xref>
        .
      </p>
      <p>The initial American project has since been
extended to other languages: French, Chinese,
Brazilian Portuguese, German, Spanish, Japanese,
Swedish and Korean.</p>
      <p>
        All these projects are based on the idea that
most frames are shared across languages
and that, thanks to this, it is possible to adopt
Berkeley’s frames and FEs and their relations,
with few changes, once all the language-specific
information has been removed
        <xref ref-type="bibr" rid="ref23 ref24 ref25">(Tonelli et al. 2009,
Tonelli 2010)</xref>
        .
      </p>
      <p>
        With regard to Italian, over the past ten years
several research projects have been carried out at
different universities and Research Centres. In
particular, the ILC-CNR in Pisa
        <xref ref-type="bibr" rid="ref11 ref22">(e.g. Lenci et al. 2008;
Johnson and Lenci 2011)</xref>
        , FBK in Trento
        <xref ref-type="bibr" rid="ref23 ref24 ref25 ref26">(e.g.
Tonelli et al. 2009, Tonelli 2010)</xref>
        and the
University of Rome, Tor Vergata
        <xref ref-type="bibr" rid="ref18 ref2 ref22 ref26 ref5">(e.g. Pennacchiotti et al.
2008, Basili et al. 2009)</xref>
        proposed automatic or
semiautomatic methods to develop an Italian
FrameNet. However, as of today, a resource even
remotely equivalent to Berkeley’s FrameNet (BFN)
is still missing.
      </p>
      <p>
        As a lexical resource of this kind is useful in
many computational applications (such as
Human-Robot Interaction), a new joint effort is currently
under way at the universities of Bologna and
Rome, Tor Vergata. The IFrameNet project aims to
develop a large-coverage FrameNet-like resource
for Italian, relying on robust and scalable methods
in which automatic corpus processing is
consistently integrated with manual lexical analysis. It
builds upon the achievements of previous projects
that automatically harvested FrameNet LUs by
exploiting both distributional and WordNet-based
models
        <xref ref-type="bibr" rid="ref18 ref5">(Pennacchiotti et al. 2008)</xref>
        . Since LU
induction is a noisy process, the data thus obtained
need to be manually refined and validated.
      </p>
      <p>
        A further aim is to provide sample sentences for
the LUs with the highest corpus frequency. On the one
hand, they will be derived from existing
resources such as the HuRIC corpus
        <xref ref-type="bibr" rid="ref3">(Bastianelli
2014)</xref>
        or the EvalIta2011 FLaIT task data: the FBK set
        <xref ref-type="bibr" rid="ref22">(Tonelli, Pianta 2008)</xref>
        and the ILC set
        <xref ref-type="bibr" rid="ref14">(Lenci et al.
2012)</xref>
        . On the other hand, candidate sentences will
also be extracted through semi-automatic
distributional analysis of a large corpus, i.e. CORIS
        <xref ref-type="bibr" rid="ref20">(Rossini Favretti et al. 2002)</xref>
        , and refined through
linguistic analysis and manual validation of the data thus
obtained.
      </p>
    </sec>
    <sec id="sec-3">
      <title>The development of the large-scale IFrameNet resource</title>
      <p>The need for a large-scale resource cannot be
satisfied without resorting to a semi-automatic process
for gathering linguistic evidence, selecting
lexical examples and annotating the
targeted texts. This work thus lies at the crossroads
of theoretical linguistic investigation, corpus
analysis and natural language processing.</p>
      <p>On the one hand, the matching between LUs
and frames is always guaranteed by manual
linguistic validation applied to the data in the
development stage. For every frame, the correctness of
the induced LUs is analysed and the ‘missing’
LUs, that is, the translations of BFN LUs that are
absent from the list of induced LUs, are detected.</p>
      <p>On the other hand, most choices rely on large
sets of corpus examples, as made available by
CORIS. Finally, the scaling to large sets of textual
examples is supported by automatically searching
candidate items through semantic pre-filtering over
the corpus: frame phenomena are here used as
queries while intelligent retrieval and ranking methods
are applied to the corpus material to minimize the
manual effort involved.</p>
      <p>In the following sections, we sketch the
main stages of the process that integrates the above
paradigms.</p>
      <sec id="sec-3-1">
        <title>Integrating corpus processing and lexical analysis for populating IFrameNet</title>
        <p>
          The beneficial contribution of the interaction
between corpus processing techniques and lexical
analysis for the semi-automatic expansion of the
FrameNet resource has been discussed since
          <xref ref-type="bibr" rid="ref18 ref5">(Pennacchiotti et al. 2008)</xref>
          , where LU induction is
presented as the task of assigning a generic lexical unit
not yet present in the FrameNet database (the
so-called unknown LU) to the correct frame(s). The
number of possible classes (i.e. frames) and the
problem of multiple assignment make it a
challenging task. This task is discussed in
          <xref ref-type="bibr" rid="ref16 ref18 ref22 ref4 ref5 ref6">(Pennacchiotti et
al. 2008, De Cao et al. 2008, Croce and Previtali
2010)</xref>
          , where different models combine
distributional and paradigmatic lexical information (i.e.
derived from WordNet) to assign unknown LUs to
frames. In particular, distributional models are used
to select a list of frames suggested by the corpus
evidence, and then the plausible lexical senses of
the unknown LU are used to re-rank the proposed
frames.
        </p>
        <p>
          In order to rely on comparable representations
for LUs and sentences for transferring semantic
information from the former to the latter, we
exploit Distributional Models (DM) of Lexical
Semantics, in line with
          <xref ref-type="bibr" rid="ref18 ref5">(Pennacchiotti et al. 2008)</xref>
          and
          <xref ref-type="bibr" rid="ref18 ref5">(De Cao et al. 2008)</xref>
          . DMs are intended to acquire
semantic relationships between words, mainly by
looking at the word usage. The foundation for these
models is the Distributional Hypothesis
          <xref ref-type="bibr" rid="ref10">(Harris
1954)</xref>
          , i.e. words that are used and occur in the
same “contexts” tend to be semantically similar. A
context is a set of words appearing in the
neighborhood of a target predicate word (e.g. an LU). In this
sense, if two predicates share many contexts, then
they can be considered similar in some way.
Although different ways of modeling word semantics
exist
          <xref ref-type="bibr" rid="ref15 ref17 ref19 ref21">(Sahlgren 2006; Pado and Lapata 2007;
Mikolov et al. 2013; Pennington et al. 2014)</xref>
          , they
all derive vector representations for words from
more or less complex processing stages of
large-scale text collections. This kind of approach is
advantageous in that it enables the estimation of
semantic relationships in terms of vector similarity.
From a linguistic perspective, such vectors allow
some aspects of lexical semantics to be
geometrically modelled, and they provide a useful way
to represent this information in a machine-readable
format. Distributional methods can model different
semantic relationships, e.g. topical similarities (if
vectors are built considering the occurrence of a
word in documents) or paradigmatic similarities (if
vectors are built considering the occurrence of a
word in the (short) contexts of another word
          <xref ref-type="bibr" rid="ref21">(Sahlgren 2006)</xref>
          ). In such models, words like run
and walk are close in the space, while run and read
are likely to be projected into different subspaces.
Here, we concentrate on DMs mainly devoted to
modelling paradigmatic relationships, as we are
more interested in capturing phenomena of
quasi-synonymy, i.e. semantic similarity that tends to
preserve meaning.
        </p>
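        <p>As a minimal illustration of how vector similarity can stand in for semantic relatedness, the following Python sketch ranks words by cosine similarity. The three-dimensional vectors in toy_space are invented for this example and are not taken from CORIS; cosine similarity is an assumed choice of measure, and the function names are hypothetical.</p>
        <preformat>
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention adopted here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional vectors, invented for illustration only.
toy_space = {
    "run": [0.9, 0.8, 0.1],
    "walk": [0.8, 0.9, 0.2],
    "read": [0.1, 0.2, 0.9],
}


def most_similar(word, space):
    """Rank the other words in `space` by decreasing similarity to `word`."""
    target = space[word]
    scored = [(other, cosine_similarity(target, vec))
              for other, vec in space.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("run", toy_space)
```
        </preformat>
        <p>With these toy values, walk is ranked above read as a neighbour of run, which mirrors the paradigmatic behaviour described above.</p>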
      </sec>
      <sec id="sec-3-2">
        <title>The development cycle</title>
        <p>In the following paragraphs, we outline the different
stages in the development process. Each stage
corresponds to specific computational processes.</p>
        <p>Validation of existing resources. At this stage,
the existing resources, dating back to previous
work, are analysed and manually pruned of errors
such as lexical units wrongly assigned to frames
(e.g. ‘asta’ (pole) or ‘colmo’ (peak) to the frame
‘BODY_PARTS’) or words never assigned to their
correct frame, for instance the LUs ‘piede’ (foot) or
‘mano’ (hand) for the frame ‘BODY_PARTS’.</p>
        <p>All the acquired Italian LUs have been
compared, frame by frame, to those of BFN, using
bilingual dictionaries (e.g. the Oxford bilingual dictionary)
and WordNet in order to verify the correctness of the
matching between lexical units and frames. Of the
15,134 automatically acquired ⟨LU, frame⟩ pairs
(6,670 nouns and 8,464 verbs and adjectives),
7,377 have been considered correctly assigned
(2,506 verb and adjective pairs and 4,871 noun pairs).</p>
        <p>In addition, bilingual dictionaries, ItalWordNet
and MultiWordNet have been used to manually
insert a list of missing lexical entries for each
frame. At the end of the process, the resulting
validated and refined ⟨LU, frame⟩ pairs amount to 7,902
(5,128 nouns and 2,774 verbs and adjectives).</p>
        <p>
          Corpus processing and lexical modeling. At
this stage, the LUs made available by manual
validation are used to build a distributional model of the
individual frames. First, distributional corpus
analysis is applied to map individual LUs into
distributional vectors. A distributional model will be
acquired from the CORIS corpus by applying the
neural method presented in
          <xref ref-type="bibr" rid="ref15">(Mikolov et al. 2013)</xref>
          .
It will enable the acquisition of geometrical
representations for words in a high-dimensional space
where distance reflects the paradigmatic relations
among words. This model can also be adopted to
build a representation for sentences, as
traditionally carried out by Distributional Semantic models,
e.g.
          <xref ref-type="bibr" rid="ref12">(Landauer and Dumais 1997)</xref>
          or
          <xref ref-type="bibr" rid="ref16 ref4">(Mitchell and
Lapata, 2010)</xref>
          .
        </p>
        <p>
          Lexical clustering is important here as specific
space regions enclosing the instance vectors of
some considered LUs correspond to semantically
coherent lexical subsets. This is a priming function
for mapping unseen word vectors to frames, as
applied in
          <xref ref-type="bibr" rid="ref18 ref5">(De Cao et al. 2008)</xref>
          : the centroids of the
possibly multiple clusters generated by the known
LUs of a given frame f are used to detect all regions
expressing f and thus to predict f for
previously unseen words and sentences. Examples
of semantically coherent regions evoked by the
verb abandon in the English FrameNet are
reported in Fig. 1. Different lexical clusters for a
given frame (i.e. DEPARTING) are depicted, and
different frames (e.g. DEPARTING,
QUITTING_A_PLACE, COLLABORATION) are also evoked
by the verb. Note that distances in the
two-dimensional plot correspond to
distances between the word embedding vectors,
while each lexical cluster is represented by the
centroid of its member vectors.
        </p>
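        <p>The centroid-based mapping described above can be sketched as follows. The cluster memberships, the three-dimensional vectors and the helper names (centroid, rank_frames) are hypothetical; the clustering step itself is assumed to have been carried out already, and cosine similarity is an assumed choice of measure.</p>
        <preformat>
```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical clusters of known-LU vectors, grouped by frame.
# A frame may own several clusters, one per coherent region of the space.
frame_clusters = {
    "DEPARTING": [
        [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
        [[0.6, 0.6, 0.0], [0.7, 0.5, 0.1]],
    ],
    "COLLABORATION": [
        [[0.0, 0.1, 0.9], [0.1, 0.2, 0.8]],
    ],
}

frame_centroids = {
    frame: [centroid(cluster) for cluster in clusters]
    for frame, clusters in frame_clusters.items()
}


def rank_frames(vector, centroids_by_frame):
    """Score each frame by its best-matching centroid and sort, best first."""
    scored = []
    for frame, centroids in centroids_by_frame.items():
        best = max(cosine_similarity(vector, c) for c in centroids)
        scored.append((frame, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical vector of a word that is not yet in the lexicon.
unseen_word = [0.85, 0.2, 0.05]
frame_ranking = rank_frames(unseen_word, frame_centroids)
```
        </preformat>
        <p>Keeping one centroid per cluster, instead of a single centroid per frame, lets a frame cover several distinct regions of the space, which is the situation described above for frames with heterogeneous lexical units.</p>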
        <p>The distributional information has been acquired
from CORIS for the 7,902 LUs considered and
used to support the LU mapping and the sentence
validation. Given a sentence s containing a
target LU l, a specific geometrical representation
for s can be derived by linearly combining the
vectors of the words w surrounding l in sentence
s. This duality property allows the embedding
space to represent sentences s, lexical units l and
generic words w alike. This makes it possible to model the
relevance of a frame f for an incoming sentence s
through the distance d(f,s) between the vector f of
a centroid for the frame f and the vector s of the
sentence s. It corresponds to a confidence measure
computed for a rule such as:</p>
        <p>“s is a valid example of the usage of frame f”</p>
        <p>The open aspects of the above semi-automatic
process are the following:
• How to design a suitable representation
(centroid or model) for a frame f
• How to define the vector for a sentence s
• How to compute the distance function d(f,s)</p>
        <p>Current research activity focuses on the
best solution for these issues, and part of our
experimental activity is devoted to assessing these design
choices, as discussed in Section 3.</p>
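        <p>One possible instantiation of the three design choices listed above is sketched below in Python, purely as an assumption for illustration: a frame is represented by one or more centroids, a sentence by the unweighted sum of the vectors of the words surrounding the target LU, and d(f,s) by the cosine distance to the closest centroid. The two-dimensional embeddings, the example sentence and all function names are hypothetical.</p>
        <preformat>
```python
import math


def sentence_vector(tokens, target, embeddings, dim):
    """Sum the vectors of the words surrounding the target LU.

    Words without an embedding are skipped; the target itself is excluded.
    """
    total = [0.0] * dim
    for token in tokens:
        if token != target and token in embeddings:
            total = [t + x for t, x in zip(total, embeddings[token])]
    return total


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 by convention when a vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def frame_distance(centroids, s_vec):
    """d(f, s): distance from the sentence vector to the closest centroid of f."""
    return min(cosine_distance(c, s_vec) for c in centroids)


# Hypothetical 2-dimensional embeddings and frame centroids.
embeddings = {"treno": [1.0, 0.1], "stazione": [0.9, 0.2]}
frame_centroids = {
    "DEPARTING": [[1.0, 0.0]],
    "COLLABORATION": [[0.0, 1.0]],
}

tokens = ["il", "treno", "parte", "dalla", "stazione"]
s_vec = sentence_vector(tokens, "parte", embeddings, 2)
d_departing = frame_distance(frame_centroids["DEPARTING"], s_vec)
d_collaboration = frame_distance(frame_centroids["COLLABORATION"], s_vec)
```
        </preformat>
        <p>In this toy setting the sentence lies much closer to the DEPARTING centroid than to the COLLABORATION one, so the rule “s is a valid example of the usage of frame f” would receive a higher confidence for DEPARTING.</p>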
        <p>First Lexical Analysis and Validation. A further
stage of the resource development focuses on the
selection of a significant sample of LUs, chosen on
the basis of their high semantic salience and
their high number of occurrences in the corpus
(primary LUs). By relying on the method described
above, we use the distributional representation of
words, lexical units and sentences to gather
CORIS sentences s in which an LU occurs and to evaluate their
suitability as examples of the evoked frame f. This
decision function is based on the geometric distance
d(f,s), which can be computed over a large number of
sentences s. When this step is carried out on
CORIS, the validation of the acquired candidate
sentences allows positive examples of a frame f
to accumulate quickly: these are used to trigger
supervised learning of f.</p>
        <p>Manual validation confirms the
proper correspondence between automatically
selected sentences and LUs that evoke a targeted
frame f. It produces novel seed examples for f:
these will serve as a training set for a semi-automatic
stage of resource expansion.</p>
        <p>
          Semi-automatic resource expansion. The
acquired distributional model will support the
semi-automatic expansion of the seed set, by selecting
the words most semantically similar to the seed set
and assigning them to frames by applying the
methodologies suggested in
          <xref ref-type="bibr" rid="ref16 ref18 ref22 ref4 ref5 ref6">(Pennacchiotti et al.
2008, De Cao et al. 2008, Croce and Previtali
2010)</xref>
          . Moreover, the same distributional model
will support the assignment process of sentences to
frames. We will in fact investigate semi-supervised
models based on clustering techniques
          <xref ref-type="bibr" rid="ref18 ref5">(Pennacchiotti et al. 2008)</xref>
          or other supervised approaches
such as Support Vector Machines as in
          <xref ref-type="bibr" rid="ref16 ref4 ref6">(Croce and
Previtali 2010)</xref>
          .
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Final Validation and Release</title>
        <p>The extracted sentences will be ordered by decreasing probability,
according to their distributional collocation, and a
list of 15 to 20 candidates per LU will be provided.
This list will be manually validated. The aim is to
provide at least 4 sample sentences for each of the
primary LUs.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Status of the Project and Perspective Views</title>
      <p>Although the general software architecture for the
project progress is available, the overall process
described above has not been fully accomplished.</p>
      <p>Current material covers a set of 554 frames and
7,902 lexical units, of which 2,604 verbs, 5,128
nouns and 170 adjectives. The average number of
occurrences for each of these selected words is
higher than 9,400, although there are still 508
words not present in CORIS.</p>
      <p>All these occurrences correspond to
about 70 million non-validated and unsorted
sentences. In the rest of the paper, we describe the
outcome of the First Lexical Analysis and
Validation stage: its aim is to trigger the semi-automatic
learning and tagging of the whole corpus,
according to the methods suggested in Section 2.2.</p>
      <sec id="sec-5-2">
        <title>Empirical Investigation: First Lexical Analysis and Validation</title>
        <p>The First Lexical Analysis and Validation stage has
now been completed. It addresses the three research
questions posed above: (I) the modelling of a frame
f, (II) the sentence representation and (III) the
definition of a distance function able to model
similarity between sentences.</p>
        <p>
          For problem (I), two approaches are
possible. We can model a frame by clustering its
lexical units and applying the method described in
          <xref ref-type="bibr" rid="ref18 ref5">(Pennacchiotti et al. 2008, De Cao et al. 2008)</xref>
          . Alternatively,
we can adopt a supervised technique.
A frame f is represented as the target class of
instances corresponding to ⟨s,l⟩ pairs, where s is an
input sentence and l is a lexical unit: a statistical
classifier is trained to map ⟨s,l⟩ into a confidence
value, and its output h(s,l,f) corresponds to the
system’s confidence that the statement
        </p>
        <p>“f is the frame evoked by l in s”
is true. Notice that the pair ⟨s,l⟩ can be expressed
as an instance by combining the embedding vector
l of its lexical unit l with a vector s for s.</p>
        <p>As a solution to problem (II), we define s as
the linear combination of the vectors w, for each word
w in s, i.e. <inline-formula><tex-math>\mathbf{s} = \sum_{w \in s} \mathbf{w}</tex-math></inline-formula>.</p>
        <p>The above formulation allows us to define the
classification task as follows:</p>
        <p>Given a sentence s including a word l as a
potential frame-evoking LU, find the frame f that
characterizes l in s.</p>
        <p>A solution to the above problem over a ⟨s,l⟩
pair would also be a useful solution to problem
(III), as the confidence h(s,l,f) in the classification
of a sentence s in a frame f for l can be taken as
the inverse of the target distance function d(f,l)
local to the sentence.</p>
        <p>The major problem with the above formulation
is that the statistical classifier cannot be trained
without useful examples
for the different frames f. The idea is thus to develop
ways to derive from CORIS the proper candidates s
for f through the knowledge of some of its LUs. In
the bootstrapping stage, we define as virtual
examples the pairs ⟨l,{l}⟩ that are retained as positive
examples for the frame f, for every l that is a known
lexical unit for f. In our approach, an example is
thus obtained by modelling the sentence s as a
singleton {l}, i.e. the lexical unit l.</p>
        <p>A statistical classifier considers every known
LU as an individual (positive) example and can be
applied to every LU in our initial resource (i.e.
7,902 for the 554 frames).</p>
        <p>
          In summary, the method works as follows. First,
for every lemma w in the corpus, an n-dimensional
embedding vector w is derived, according to
          <xref ref-type="bibr" rid="ref15">(Mikolov et al. 2013)</xref>
          . As a side effect, for every
LU l of each known frame f, the lexical embedding
vector l is used to build the example (l, l) for the
LU sentence pair: ⟨l, {l}⟩.
        </p>
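        <p>The construction of virtual examples can be sketched as follows. This is a minimal sketch under one stated assumption: the pair is encoded by concatenating the LU vector with the sentence vector, so that an n-dimensional embedding yields an instance of dimension 2n. The two-dimensional embeddings and the function names are hypothetical; the lexicon entries reuse LUs and frames mentioned in this paper.</p>
        <preformat>
```python
def pair_instance(lu_vector, sentence_vec):
    """Encode a (sentence, LU) pair by concatenating two n-dimensional vectors."""
    return list(lu_vector) + list(sentence_vec)


def virtual_examples(lexicon, embeddings):
    """Build one positive training example per known (LU, frame) pair.

    The sentence is modelled as the singleton {l}, so its vector is the
    LU vector itself. LUs without an embedding (for instance because they
    never occur in the corpus) are skipped.
    """
    examples = []
    for frame, lexical_units in lexicon.items():
        for lu in lexical_units:
            if lu not in embeddings:
                continue
            vec = embeddings[lu]
            examples.append((pair_instance(vec, vec), frame))
    return examples


# Lexicon fragment; the embedding values are invented for illustration.
lexicon = {
    "BODY_PARTS": ["piede", "mano"],
    "KILLING": ["finire"],
}
embeddings = {
    "piede": [0.2, 0.7],
    "mano": [0.3, 0.6],
    "finire": [0.9, 0.1],
}

examples = virtual_examples(lexicon, embeddings)
```
        </preformat>
        <p>Each resulting instance has twice the dimension of the underlying embedding, and the frame label attached to it can be used as the target class of a standard classifier.</p>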
        <p>A multiclass statistical classifier is trained for
every frame f for which at least 5 examples (i.e. 5
different LUs) were available.</p>
        <p>When applied to an incoming sentence s
including an LU l, the classifier outcome h(s,l,f) is said to
accept the frame f if:
• f belongs to the set of frames evoked by l
• f = argmax_{f’} h(s,l,f’)</p>
        <p>For every sentence s including a frame-evoking
lexical unit l, the above function suggests one
candidate frame among the possibly multiple ones.
When the scoring function h is negative
everywhere (e.g. with the SVM formulation of a
classification task), the sentence is rejected and is not
considered a valid example for future iterations.</p>
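        <p>The acceptance rule above can be sketched as a small Python function. The scores are hypothetical values standing in for the classifier output, and the function name select_frame is an assumption. The sketch also assumes that “negative everywhere” is checked over the frames evoked by the lexical unit, which is one possible reading of the rule.</p>
        <preformat>
```python
def select_frame(scores, candidate_frames):
    """Return the accepted frame for a (sentence, LU) pair, or None if rejected.

    scores: dict mapping each frame to the classifier confidence for the pair
    candidate_frames: frames evoked by the LU according to the lexicon
    """
    admissible = {f: scores[f] for f in candidate_frames if f in scores}
    if not admissible:
        return None
    best = max(admissible, key=admissible.get)
    if admissible[best] >= 0.0:
        return best
    return None  # every admissible frame has a negative score: reject


# Frames listed in this paper for the verb finire.
finire_frames = ["ACTIVITY_FINISH", "CAUSE_TO_END", "KILLING"]

# Hypothetical classifier scores for two sentences containing finire.
scores_accepted = {
    "ACTIVITY_FINISH": 0.8,
    "CAUSE_TO_END": 0.1,
    "KILLING": -0.5,
    "OMEN": 1.2,  # highest score overall, but not a frame evoked by finire
}
scores_rejected = {
    "ACTIVITY_FINISH": -0.3,
    "CAUSE_TO_END": -0.7,
    "KILLING": -0.2,
}

accepted = select_frame(scores_accepted, finire_frames)
rejected = select_frame(scores_rejected, finire_frames)
```
        </preformat>
        <p>Restricting the argmax to the frames already listed for the lexical unit is what prevents a high-scoring but lexically implausible frame, such as OMEN in the first toy example, from being proposed.</p>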
        <p>The application of this method to the CORIS
corpus has been carried out by applying a
multi-class SVM with a linear kernel to the
2n-dimensional vectors of each pair ⟨l, {l}⟩. Starting
from the lexicon validated in the first stages, the
SVM has been able to label over 2 million
sentences.</p>
      </sec>
      <sec id="sec-5-3">
        <title>Empirical Investigation: Current Results</title>
        <p>In order to evaluate the proposed supervised
classification method for the stage “First Lexical
Analysis and Validation”, we ran an experimental
evaluation over a set of 326<sup>1</sup> frames, the ones with
more than 5 lexical units in the initial lexicon. In
this way, we selected 1,095 different LUs, each
represented as an embedding vector in the word space.
On average, we have 12 LUs per frame, and every
individual lexical entry l appears in about 1.88
frames. The baseline of a classification task that
maps a sentence s including a lexical unit into its
own frame is about 35%, owing to the ambiguity
characterizing the most frequent entries.</p>
        <p>We asked three annotators to evaluate individual
triples ⟨l, s, f⟩, validating the system proposal. Four
main cases were possible:
• MISSING FRAME. The sentence s does not
manifest any of the frames f evoked by the
lexical unit l, but corresponds to a frame not yet
present in the lexicon for l. In this case the
algorithm cannot provide the suitable frame, as it
cannot generate a novel frame.
• NOT APPLICABLE. The sentence s does not
contain an occurrence of the lexical unit l in one of
its proper senses: this case is typical of
phraseological uses of a verb, such as morire di freddo (‘to be freezing’, literally ‘to die of cold’),
andare di fretta (‘to be in a hurry’), …, that do not directly
correspond to lexical predicates and thus cannot be
treated through the lexical embedding vectors.
• CORRECT/INCORRECT. The outcome
argmax_{f’} h(s,l,f’) is correct (or incorrect), as
the frame evoked by l in s is exactly (or is not) f.</p>
        <p>According to the above method, annotators
validated 667 sentences for 113 frames and 212 different
verbal lexical units. The analysis yielded a
precision (i.e. the number of correct candidate
frames emitted by the algorithm w.r.t. the number
of valid cases, that is all but the MISSING FRAME or
NOT APPLICABLE cases) of 75.2%, well beyond the
35% baseline. The method could be applied to
74.5% of the sentences, including CORRECT
cases and MISSING FRAME cases. We neglected in
this coverage score the NOT APPLICABLE cases, which
amount to 44 sentences, i.e. about 6.4%.</p>
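        <p>The precision measure defined above can be computed from the annotators' judgements as in the following sketch. The list of judgements is invented for illustration and does not reproduce the annotated data; the function name evaluate and the returned keys are assumptions. Only the precision and the share of NOT APPLICABLE cases are computed here.</p>
        <preformat>
```python
from collections import Counter


def evaluate(judgements):
    """Compute precision over valid cases and the share of NOT APPLICABLE cases.

    Valid cases are all judgements except MISSING FRAME and NOT APPLICABLE,
    that is the CORRECT and INCORRECT ones.
    """
    counts = Counter(judgements)
    total = len(judgements)
    valid = counts["CORRECT"] + counts["INCORRECT"]
    precision = counts["CORRECT"] / valid if valid else 0.0
    not_applicable = counts["NOT APPLICABLE"] / total if total else 0.0
    return {
        "valid": valid,
        "precision": precision,
        "not_applicable": not_applicable,
    }


# Hypothetical judgements, invented for illustration only.
judgements = (
    ["CORRECT"] * 6
    + ["INCORRECT"] * 2
    + ["MISSING FRAME"]
    + ["NOT APPLICABLE"]
)
report = evaluate(judgements)
```
        </preformat>
        <p>Excluding the MISSING FRAME and NOT APPLICABLE cases from the denominator means that precision only measures how often the proposed frame is right when a right answer exists in the lexicon.</p>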
        <p>Examples of the correct assignment of the
algorithm on quite ambiguous verbs, such as finire (i.e.
to end, in frames ACTIVITY_FINISH,
CAUSE_TO_END and KILLING) or rivelare (i.e. to
reveal, in frames REVEAL_SECRET, OMEN,
EVIDENCE) are the following:
La vicenda avrebbe potuto [finire]ACTIVITY_FINISH lì , ma il prefetto
di Nuoro fece presentare ...</p>
        <p>In prova si è [rivelato]EVIDENCE ad altissimo livello sia sull'
asciutto sia sul ...</p>
        <p><sup>1</sup> By keeping the frames that include at least 4 lexical units, the
number of targeted frames grows to 371.</p>
        <p>An example of MISSING FRAME is BEAT_OPPONENT
for the verb battere in
... impegnato a fornire quante più informazioni possibili, anche
per [battere]BEAT_OPPONENT la concorrenza dei siti Ipsoa e il ...
as the lexicon of the verb battere only includes the
frames CAUSE_HARM, CORPORAL_PUNISHMENT
and EXPERIENCE_BODILY_HARM.</p>
        <p>The experiments, so far run only over verbal lexical
units, will soon be extended to nouns and
adjectives. However, the encouraging precision reached
by the method allows its direct use in an iterative
active learning schema, where the more ambiguous
sentences found and annotated within a specific
training stage are used to train the system at the
next stage. We expect this to speed up the lexicon
development process and to allow bootstrapping
with fewer resources. The lexicon will be made
available for crowdsourcing further annotations and
delivered incrementally in the next few months.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>J. B.</given-names>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>The Berkeley FrameNet project</article-title>
          .
          <source>In: Proceedings of COLING '98</source>
          , vol.
          <volume>1</volume>
          , Canada,
          <fpage>86</fpage>
          -
          <lpage>90</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Basili</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Cao</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croce</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coppola</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moschitti</surname>
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Cross-Language Frame Semantics Transfer in Parallel Corpora</article-title>
          .
          <source>In: Proceedings of the CICLing</source>
          <year>2009</year>
          , Best Paper Award. Mexico
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Bastianelli</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castellucci</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croce</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iocchi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basili</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>HuRIC: a Human Robot Interaction Corpus</article-title>
          .
          <source>In Proceedings of LREC</source>
          <year>2014</year>
          ,
          <fpage>4519</fpage>
          -
          <lpage>4526</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Croce</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Previtali</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Manifold learning for the semi-supervised induction of framenet predicates: an empirical investigation</article-title>
          .
          <source>In Proceedings of GEMS '10</source>
          , pages
          <fpage>7</fpage>
          -
          <lpage>16</lpage>
          , Stroudsburg, PA, USA.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>De Cao D.</surname>
          </string-name>
          ,
          <string-name>
            <surname>Croce</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennacchiotti</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basili</surname>
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Combining word sense and usage for modeling frame semantics</article-title>
          .
          <source>In Proceedings of STEP</source>
          <year>2008</year>
          , Italy
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>De Cao D.</surname>
          </string-name>
          ,
          <string-name>
            <surname>Croce</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basili</surname>
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Extensive Evaluation of a FrameNet-WordNet mapping resource</article-title>
          .
          <source>In: Proceedings of the LREC</source>
          <year>2010</year>
          , Malta.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          (
          <year>1985</year>
          ).
          <article-title>Frames and the semantics of understanding</article-title>
          .
          <source>Quaderni di Semantica</source>
          ,
          <volume>VI</volume>
          (
          <issue>2</issue>
          )
          ,
          <fpage>222</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>Charles J</given-names>
          </string-name>
          . (
          <year>1976</year>
          ).
          <article-title>Frame semantics and the nature of language</article-title>
          ,
          <source>Annals of the New York Academy of Sciences: Conference on the Origin and Development of Language and Speech</source>
          , vol.
          <volume>280</volume>
          , pp.
          <fpage>20</fpage>
          -
          <lpage>32</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          (
          <year>1982</year>
          ).
          <article-title>Frame semantics</article-title>
          .
          <source>Linguistics in the Morning Calm</source>
          , pp.
          <fpage>111</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          (
          <year>1954</year>
          ).
          <article-title>Distributional structure</article-title>
          .
          In Jerrold J. Katz and Jerry A. Fodor, editors,
          <source>The Philosophy of Linguistics</source>
          , New York. Oxford University Press.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lenci</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Verbs of visual perception in Italian FrameNet</article-title>
          ,
          <source>Advances in Frame Semantics</source>
          ,
          <volume>3</volume>
          (
          <issue>1</issue>
          ),
          <fpage>9</fpage>
          -
          <lpage>45</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Landauer</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Dumais</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>1997</year>
          ).
          <article-title>A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge</article-title>
          .
          <source>Psychological Review</source>
          ,
          <volume>104</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
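          <string-name>
            <surname>Lenci</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapesa</surname>
            ,
            <given-names>G.</given-names>
          </string-name>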
          (
          <year>2010</year>
          ).
          <article-title>Building an Italian FrameNet through Semi-automatic Corpus Analysis</article-title>
          .
          <source>Proceedings of LREC 2010</source>
          . Malta.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Lenci</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montemagni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venturi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cutrullà</surname>
            ,
            <given-names>M. G.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Enriching the ISST-TANL Corpus with Semantic Frames</article-title>
          .
          <source>In Proceedings of LREC</source>
          <year>2012</year>
          , Istanbul, Turkey
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
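          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>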
          (
          <year>2013</year>
          ).
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>CoRR</source>
          , abs/1301.3781. http://arxiv.org/abs/1301.3781.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lapata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Composition in distributional models of semantics</article-title>
          .
          <source>Cognitive Science</source>
          ,
          <volume>34</volume>
          (
          <issue>8</issue>
          ):
          <fpage>1388</fpage>
          -
          <lpage>1429</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Padó</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lapata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Dependency-based construction of semantic space models</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>33</volume>
          (
          <issue>2</issue>
          ):
          <fpage>161</fpage>
          -
          <lpage>199</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Pennacchiotti</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Cao</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basili</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croce</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Automatic induction of FrameNet lexical units</article-title>
          .
          <source>In: Proceedings of the EMNLP</source>
          <year>2008</year>
          , Hawaii
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>GloVe: Global Vectors for Word Representation</article-title>
          ,
          <source>In Proceedings of EMNLP</source>
          <year>2014</year>
          ,
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Rossini Favretti</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tamburini</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Santis</surname>
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2002</year>
          ).
          <article-title>CORIS/CODIS: A corpus of written Italian based on a defined and a dynamic model</article-title>
          .
          <source>In A Rainbow of Corpora: Corpus Linguistics and the Languages of the World</source>
          , Lincom-Europa, Munich
          ,
          <fpage>27</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Sahlgren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          . (
          <year>2006</year>
          ).
          <article-title>The Word-Space Model</article-title>
          .
          <source>Ph.D. thesis</source>
          , Stockholm University.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Tonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pianta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Frame information transfer from English to Italian</article-title>
          .
          <source>In Proceedings of LREC</source>
          , Marrakech, Morocco
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Tonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pighin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giuliano</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pianta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Semi-automatic Development of FrameNet for Italian</article-title>
          .
          <source>In Proceedings of the FrameNet Workshop and Masterclass</source>
          , Milan, Italy
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Tonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pighin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>New Features for FrameNet - WordNet Mapping</article-title>
          ,
          <source>In Proceedings of CoNLL</source>
          <year>2009</year>
          , Boulder, Colorado,
          <fpage>219</fpage>
          -
          <lpage>227</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Tonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Semi-automatic techniques for extending the FrameNet lexical database to new languages</article-title>
          , Università Ca' Foscari, Venezia
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Venturi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenci</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montemagni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vecchi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sagri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tiscornia</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agnoloni</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Towards a FrameNet Resource for the Legal Domain</article-title>
          .
          <source>In Proceedings of LOAIT 2009</source>
          . Barcelona, Spain
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>