<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Induction of FrameNet Lexical Units in Italian</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Silvia Brambilla</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Danilo Croce</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Tamburini</string-name>
          <email>fabio.tamburini@unibo.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberto Basili</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>In this paper we investigate the applicability of automatic methods for frame induction to improve the coverage of IFrameNet, a novel lexical resource based on Frame Semantics in Italian. The experimental evaluations show that the adopted methods based on neural word embeddings pave the way for the assisted development of a large scale lexical resource for our language.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        When dealing with large-scale lexical resources,
such as FrameNet
        <xref ref-type="bibr" rid="ref3">(Baker et al., 1998)</xref>
        , PropBank
        <xref ref-type="bibr" rid="ref20">(Palmer et al., 2005)</xref>
        , VerbNet
        <xref ref-type="bibr" rid="ref26">(Schuler, 2005)</xref>
        or VerbAtlas
        <xref ref-type="bibr" rid="ref11">(Di Fabio et al., 2019)</xref>
        , the
semi-automatic association between predicates and
lexical items (also known as Lexical Units or LUs)
is crucial to improve the coverage of a resource
while limiting the costs of its manual
annotation. Several approaches to this semi-supervised
task exist, as discussed in QasemiZadeh et al.
(2019). In particular, Pennacchiotti et al. (2008)
exploited distributional models of lexical
meaning
        <xref ref-type="bibr" rid="ref10 ref25 ref9">(Sahlgren, 2006; Croce and Previtali, 2010)</xref>
        to induce new LUs consistently with the Frame
Semantics theory
        <xref ref-type="bibr" rid="ref3">(Baker et al., 1998)</xref>
        ,
representing word meanings and semantic frames through
geometrical word spaces. This
approach makes it possible to induce new LUs when applied
to the English version of FrameNet. However, this
is a well-consolidated resource with many
existing LUs connected to each semantic predicate, i.e.,
each frame. The applicability of this method in
scenarios where only one or two LUs are available
for each frame is still an open issue. At the same
time, since the work of Pennacchiotti et al. (2008),
the application of neural approaches to the
acquisition of word embeddings
        <xref ref-type="bibr" rid="ref18 ref19">(Mikolov et al., 2013;
Baroni et al., 2014; Ling et al., 2015)</xref>
        has significantly
improved geometrical models of lexical
semantics in terms of both representation
capability and scalability.
      </p>
      <p>
        Copyright © 2020 for this paper by its authors. Use
permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).
      </p>
      <p>
        In this paper we thus investigate the
applicability of the method proposed in Pennacchiotti et al.
(2008) to boost the coverage of a novel and still
limited lexical resource based on Frame
Semantics in Italian. This resource has been developed
within the IFrameNet (IFN) project
        <xref ref-type="bibr" rid="ref7">(Basili et al.,
2017)</xref>
        , which aims at creating a large-coverage
FrameNet-like resource for Italian and at building
a complete dictionary in which every lexical
entry<sup>1</sup> is linked to all the frames it can evoke (i.e.,
the frames for which it is a LU). At present,
while the resource counts more than 7,700
lexical items associated with more than 1,048 frames,
each lexical item is connected, on average, to only
1.3 frames, which is problematic given the
high polysemy of Italian words
        <xref ref-type="bibr" rid="ref8">(Casadei, 2014)</xref>
        .
      </p>
      <p>The experimental evaluation shows that neural
word embeddings enable the effective application
of the distributional approach from Pennacchiotti
et al. (2008) to improve the coverage of IFN.
Moreover, the adopted distributional framework
made it possible to develop a graphical semantic browser
that supports annotators while they assign new LUs to
frames. This study paves the way for the
semi-automatic development of IFN and investigates
the applicability of neural word embeddings
to the incremental semi-automatic LU induction
process.</p>
      <p><sup>1</sup> By lexical entry we denote a lemma, together with its Part-of-Speech tag, that activates at least one LU.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        In the development of FrameNet and
FrameNet-like resources for new languages, one important
task is the creation of a large-scale dictionary, in
order to guarantee an effective application in
semantic analyses or NLP tasks. In fact, the limited
coverage of FrameNet has been identified as one
of the main causes of failure
        <xref ref-type="bibr" rid="ref21 ref22">(Pennacchiotti et al.,
2008; Pavlick et al., 2015)</xref>
        . For these reasons, and
given the high costs of manual annotation, both in
terms of time and resources (i.e., human
annotators), the automatic (or semi-automatic) expansion
of the dictionary for FrameNet and
FrameNet-like resources has received attention over the
years. Several methods to support the population
of frames in FrameNet
        <xref ref-type="bibr" rid="ref1 ref1 ref2 ref21 ref23 ref32 ref33 ref4">(Baker et al., 2007; Pavlick
et al., 2015; Ustalov et al., 2018; QasemiZadeh et
al., 2019; Anwar et al., 2019; Arefyev et al., 2019;
Yong and Torrent, 2020)</xref>
        , and FrameNet-like
resources
        <xref ref-type="bibr" rid="ref14 ref15 ref16 ref29 ref30 ref31 ref4">(Johansson and Nugues, 2007; Tonelli et
al., 2009; Tonelli, 2010; Johansson, 2014; Hayoun
and Elhadad, 2016)</xref>
        with new Lexical Units have
been widely investigated. Some of the
methodologies proposed in order to automatically
expand FrameNet have exploited the alignment
between WordNet and FrameNet data
        <xref ref-type="bibr" rid="ref12 ref15 ref22 ref4">(Johansson
and Nugues, 2007; Pennacchiotti et al., 2008;
Ferrández et al., 2010)</xref>
        . Another strategy is the one
adopted by Pavlick et al. (2015), who
enlarge FrameNet coverage using automatic
paraphrasing. Most works dealing
with automatic frame induction, however, exploit
distributional methods: for example, the work on
which this research relies the most, i.e., that
of Pennacchiotti et al. (2008), or some of the most
recent works, such as those of Ustalov et al.
(2018), Arefyev et al. (2019) and Yong and Torrent
(2020). Ustalov et al. (2018), for example, model
the frame induction problem as a tri-clustering
problem and use dependency triples automatically
extracted from a Web-scale corpus. Arefyev et al.
(2019) propose to combine dense representations
from hidden layers of a masked language model
with sparse representations based on substitutes
for the target word in the context for the creation
of vector representations.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 IFrameNet status</title>
      <p>
        The IFrameNet project
        <xref ref-type="bibr" rid="ref7">(Basili et al., 2017)</xref>
        , relied,
as a starting point, on the achievements of previous
research on the development of Italian resources
annotated according to Frame Semantics
        <xref ref-type="bibr" rid="ref10 ref29 ref30">(Tonelli
and Pianta, 2009; DeCao et al., 2010)</xref>
        , i.e., a set
of automatically induced LUs covering
554 of the 1,224 frames in FrameNet.
      </p>
      <p>
        Since the beginning, our main objective has
been to improve the coverage of the resource in
terms of annotated frames, increasing the number
of the LUs and the number of annotated sentences
representing each predicate. Starting from the
results achieved in 2017, we enlarged the dictionary
and provided an initial set of LUs for those frames
without any annotation. We also revised the whole
dictionary and expunged the LUs whose lemma
had low frequency<sup>2</sup> in CORIS (Corpus di
Italiano Scritto)
        <xref ref-type="bibr" rid="ref24">(Rossini Favretti et al., 2002)</xref>
        . Since
CORIS is a large-scale, general-purpose Italian
corpus (without bias towards any domain), we
speculate that LUs not represented in it can hardly
characterize a frame in Italian. Moreover, we worked on the
frame annotation of sample sentences taken from
the CORIS corpus. We relied on CORIS because
it is domain-independent and suitable to represent
the generic notion of frames. Currently, the
resource contains:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>7,776 lexical entries, of which 1,130 are adjectives, 4,309 nouns and 2,337 verbs;</p>
        </list-item>
        <list-item>
          <p>10,379 LUs (nouns, verbs and adjectives) validated in terms of pairs of lexical entries and evoked frame(s);</p>
        </list-item>
        <list-item>
          <p>1,048 frames with at least one LU, among which 743 frames are represented with at least one sentence. Among the 176 frames that still do not have any LU in their dictionary, 134 are marked as Non-Lexical in FrameNet, 12 do not have any LU in FrameNet but are not explicitly marked as Non-Lexical, 18 are not represented in FrameNet by any noun, verb or adjective and finally, for just 8 frames, it was difficult to find LUs in Italian (e.g. IMPROVISED EXPLOSIVE DEVICE or SHORT SELLING);</p>
        </list-item>
        <list-item>
          <p>5,208 sentences annotated and validated with at least one LU;</p>
        </list-item>
        <list-item>
          <p>an average of 9.9 LUs assigned to each frame;</p>
        </list-item>
        <list-item>
          <p>an average of 1.3 frames associated to each LU. Among the existing LUs, 5,960 are assigned to only one frame. Given that the Italian language is highly polysemous, it is probable that many LUs evoke more than one frame.</p>
        </list-item>
      </list>
      <p>This work aims at reducing this limitation.</p>
      <p><sup>2</sup> Fewer than 20 occurrences in the corpus.</p>
    </sec>
    <sec id="sec-5">
      <title>4 Automatic Frame Induction</title>
      <p>For Frame Induction we rely on distributional
methods as in Pennacchiotti et al. (2008),
described hereafter.</p>
      <p>
        <bold>Distributional representation.</bold> As a first step,
        we obtain a distributional representation of the
CORIS corpus and represent each LU in the wordspace
as a vector <inline-formula><tex-math>\vec{l}</tex-math></inline-formula>. We investigated three
slightly different approaches for the
acquisition of the wordspaces: the Continuous
Bag-of-Words model (CBOW), the Skip-gram model
        <xref ref-type="bibr" rid="ref19">(Mikolov et al., 2013)</xref>
        and the Structured
Skip-gram (sskip-gram) model
        <xref ref-type="bibr" rid="ref18">(Ling et al., 2015)</xref>
        .
The sskip-gram is a modification of the
skip-gram model, sensitive to the positioning
of the words and, thus, more suitable for
capturing syntactic properties of the words
        <xref ref-type="bibr" rid="ref18">(Ling et
al., 2015)</xref>
        . Our hypothesis is that this last model
is more suitable for capturing the frame
properties of LUs, since syntax is, in general, in agreement
with semantic arguments (i.e., Frame Elements,
FEs) and their order.
      </p>
      <p>
        <bold>“Framehood” representation.</bold> As a second step,
we exploit the obtained embeddings to represent
the meaning of frames. We assume that a frame <italic>f</italic>
can be described by the set <italic>F</italic> of its LUs, with <inline-formula><tex-math>l \in F</tex-math></inline-formula>, and
that the LU vectors <inline-formula><tex-math>\vec{l}</tex-math></inline-formula> can thus be used to acquire a
distributional representation for each frame. In a
nutshell, for each frame we: (i) select all the LUs
of its dictionary, and (ii) apply a
clustering algorithm to their vectors. A frame is then represented
as a set of clusters: given that each frame can have
various nuances and can be representative
of non-overlapping senses, scattered across the
semantic space, we represent it through its “clusters of
senses”. This captures, in the semantic space, the
possible “framehood” distributions as dense
regions of LUs. In this work, we applied standard
K-means
        <xref ref-type="bibr" rid="ref13">(Hartigan and Wong, 1979)</xref>
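        , so that each
frame is represented as a set of <italic>k</italic> clusters. For each
frame, <italic>k</italic> is empirically set to the square root of the
number of LUs <italic>l</italic> in that frame: <inline-formula><tex-math>k = \sqrt{|l|}</tex-math></inline-formula>, where
<inline-formula><tex-math>|l|</tex-math></inline-formula> denotes the number of LUs in the frame. In this way,
each frame <italic>f</italic> has a number of clusters <italic>k</italic> that depends on the
number of its LUs, and the centroid of each cluster
represents the prototype for a subset of the senses
of the frame.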
      </p>
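      <p>The clustering step described above can be illustrated with a minimal sketch. It is not the implementation used in the experiments: it replaces the Hartigan and Wong algorithm with a plain Lloyd-style iteration, it assumes that the square root is rounded to the nearest integer (the rounding rule is not stated in the text), and the names <monospace>choose_k</monospace>, <monospace>kmeans</monospace> and <monospace>frame_centroids</monospace>, as well as the toy vectors, are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
import random


def choose_k(num_lus):
    """k = sqrt(|l|); rounding to the nearest integer is an assumption."""
    return max(1, round(math.sqrt(num_lus)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20, seed=0):
    """Lloyd-style k-means (not Hartigan-Wong); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def frame_centroids(lu_vectors):
    """Represent one frame as the centroids of its 'clusters of senses'."""
    return kmeans(lu_vectors, choose_k(len(lu_vectors)))


# Hypothetical 2-dimensional LU vectors for a single frame.
toy_frame = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
centroids = frame_centroids(toy_frame)
print(len(centroids))
```
]]></preformat>
      <p>With four LUs the sketch builds two clusters; a frame with nine LUs would be represented by three.</p>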
      <p><bold>New LU induction.</bold> Once the distributional
representations for frames and LUs have been obtained, the
third step involves the automatic induction of
frames given a candidate lexical item. For each
candidate predicate word, we compute the
distance between its vector and the sets of clusters
representing the frames. The “nearest” clusters
are the ones containing the LUs most
closely related to the input lexical item, so that
the corresponding frames are suggested as the
frames it evokes.</p>
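      <p>As a rough illustration of this ranking step, the sketch below scores each frame by its cluster centroid closest to the candidate vector and returns the best frames. The text does not specify the distance function, so cosine similarity is an assumption here; the function names and the toy centroids are hypothetical and do not come from the actual resource.</p>
      <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_frames(candidate, frame_clusters, top_n=10):
    """Rank frames by the similarity of their closest centroid to the candidate."""
    scores = {
        frame: max(cosine(candidate, centroid) for centroid in centroids)
        for frame, centroids in frame_clusters.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy centroids (2-dimensional for readability).
frame_clusters = {
    "KILLING": [[1.0, 0.1], [0.8, 0.3]],
    "FOOD": [[0.1, 1.0]],
    "ANIMALS": [[0.3, 0.9], [0.5, 0.5]],
}
print(suggest_frames([0.9, 0.2], frame_clusters, top_n=2))
```
]]></preformat>
      <p>Taking the maximum over the centroids of a frame means that a candidate only needs to be close to one of the frame's “clusters of senses” for that frame to be suggested.</p>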
    </sec>
    <sec id="sec-7">
      <title>5 Experimental Evaluation</title>
      <p>In order to assess the quality of the
proposed method, we evaluate its capability to
rediscover the frames manually associated with a
lexical item. We apply a leave-one-out scheme:
for each candidate lexical item, we remove it
from the dictionary and query the model to
“suggest” up to 10 frames. In practice, we rebuild the
clusters and then compute the distance between
the lexical item’s vector and the set of clusters
representing all frames. Then, we compare the
suggested frames with the frames that were
originally linked to the LU. As in Pennacchiotti et al.
(2008), we compute Accuracy as the fraction of
LUs that are correctly re-assigned to the original
frame. Accuracy is computed at different levels
b: a LU is correctly assigned if one of its gold
standard frames appears among the best-b frames
ranked by the model. In fact, as LUs can have
more than one correct frame, we deem as
“correct” an assignment for which at least one of the
correct frames is among the best-b.</p>
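      <p>The Accuracy measure defined above can be sketched as follows. The data structures (a ranked list of suggested frames per LU and a set of gold frames per LU), the function name <monospace>accuracy_at_b</monospace> and the placeholder identifiers are assumptions made for illustration only; they are not experimental data.</p>
      <preformat><![CDATA[
```python
def accuracy_at_b(suggestions, gold, b):
    """Fraction of LUs with at least one gold frame among the best-b suggestions."""
    hits = sum(
        1
        for lu, ranked in suggestions.items()
        if set(ranked[:b]) & gold[lu]
    )
    return hits / len(suggestions)


# Placeholder LUs and frames, each LU being left out in turn before ranking.
suggestions = {
    "lu1": ["F1", "F2", "F3"],
    "lu2": ["F4", "F5", "F6"],
    "lu3": ["F7", "F8", "F9"],
    "lu4": ["F2", "F1", "F3"],
}
gold = {
    "lu1": {"F1"},
    "lu2": {"F5", "F9"},
    "lu3": {"F1"},
    "lu4": {"F3"},
}

for b in (1, 2, 3):
    print(b, accuracy_at_b(suggestions, gold, b))
```
]]></preformat>
      <p>By construction, the measure can only stay the same or grow as b increases, since the best-b list always contains the best-(b-1) list.</p>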
      <p>The model is evaluated by sampling the test bed
according to two dimensions, as reported in Table
1. First, we considered the Part-of-Speech (POS)
of the LUs (i.e., rows in Table 1). In fact,
lexical items having different POS are generally
projected into different sub-spaces within word spaces.
We thus evaluate the model considering separately
LUs and frames containing adjectives (a), nouns
(n) or verbs (v). For the sake of completeness, we
also evaluated the model without any selection by
POS (row a-n-v). When a frame does not contain
any LU represented in the wordspace with the
required POS, it is discarded during the evaluation:
as an example, the current dictionary contains 631
frames containing at least one noun.</p>
      <p>Then, we filtered frames by applying a
threshold to the number of LUs a frame should be
connected to in order to be considered (columns in
Table 1), as follows: first, we considered all
frames containing at least one LU whose lemma
occurred at least 20 times in CORIS, without
applying any other restriction (column 1); then we
kept only frames with at least 2 valid LUs<sup>3</sup> (column
2); finally, we kept only frames with at least 5 valid
LUs (column 5). The two filtering policies can be
combined, and the stricter they are, the lower
the number of frames considered in the
evaluation. As a consequence, the Accuracy baseline of
a model that randomly assigns LUs to frames
depends on the number of selected frames: when
no filter is applied (row a-n-v and column 1), a
random assignment would achieve an Accuracy of
<inline-formula><tex-math>0.09\% \approx \frac{1}{1{,}041}</tex-math></inline-formula>,
or <inline-formula><tex-math>0.4\% = \frac{1}{250}</tex-math></inline-formula> when only frames
containing at least 5 nouns are selected.</p>
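      <p>The arithmetic behind the random baseline is simple enough to check directly. The sketch below assumes, as a simplification, a uniform choice of a single frame and one gold frame per LU; the function name is hypothetical, while the frame counts are the two mentioned in the text.</p>
      <preformat><![CDATA[
```python
def random_baseline_percent(num_frames):
    """Expected best-1 Accuracy (in %) of a uniformly random frame assignment."""
    return 100.0 / num_frames


no_filter = random_baseline_percent(1041)   # row a-n-v, column 1
nouns_th5 = random_baseline_percent(250)    # frames with at least 5 nouns
print(round(no_filter, 3), round(nouns_th5, 3))
```
]]></preformat>
      <p>The first value is just under 0.1%, in line with the 0.09% reported above, and the second is 0.4%.</p>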
      <p>
        Table 2 reports the experimental results of a
model derived using a sskip-gram model
        <xref ref-type="bibr" rid="ref18">(Ling
et al., 2015)</xref>
        <sup>4</sup>. If we consider the performance over
only nouns (n), we see that, when a reasonable
threshold is set (row th = 2), in 48% of cases
one of the original frames evoked by the noun under
analysis is found in first position (column b-1).
If we consider the first two frames proposed by the
system (b-2), the Accuracy rises to 61%, and
it keeps increasing as we consider more frames. This
is impressive considering that the corresponding
random baselines are about 0.2% and 0.4%, respectively.
If we jointly consider nouns, verbs and adjectives
(a-n-v), the performance is slightly lower: for
example, with the same threshold th = 2 and
considering only two suggested frames (b-2), the
Accuracy is 61%. This means that, on average, the
capability of the model to assign LUs (ignoring their
POS) to frames is slightly lower. This is confirmed
by the general drop obtained when only verbs or
adjectives are considered: for verbs, considering
only the best suggestion (b-1), the Accuracy goes from 25%
when no threshold is applied, to 32% with
th = 2, to 42% with th = 5.
This is mainly due to the higher polysemy
characterizing verbs and adjectives with respect to nouns
        <xref ref-type="bibr" rid="ref8">(Casadei, 2014)</xref>
        . In any case, this result remains well above
chance: for verbs, the baseline
in the setting th = 2 and b = 1 corresponds to
about 0.2%.
      </p>
      <p><sup>3</sup> This threshold also overcomes the intrinsic limitation of the leave-one-out scheme: when considering frames with only one LU, it becomes impossible to spot the original frame in the test data because it will not be represented by any LU.</p>
      <p><sup>4</sup> This method outperformed the CBOW and skip-gram models, whose results are not reported here for lack of space.</p>
      <p><bold>Discussion.</bold> It is worth noting that our
dictionary is largely incomplete and thus some of the assignments
counted as “incorrect” are instead
frames that are evoked by the LU under analysis
and that should be added to the dictionary.
Moreover, we can see that many of the b-10 frames
are related, to different degrees, to the
lexical entry under analysis and to the frames for
which it is a LU.</p>
      <p>For example, when considering the lexical
entry “impiccare.v” (hang.v), the model does not
retrieve among the b-10 suggestions the only
“correct” frame, i.e., EXECUTION.
However, the closest frame identified is
KILLING, which not only is linked to
EXECUTION by an Inheritance relation, but also
appears to be evoked by “impiccare.v”. Similarly,
the system is not able to re-assign the lexical
entries “innalzarsi.v” (raise.v and rise.v),
“innocenza.n” (innocence.n) and “radiazione.n”
(radiation.n or expulsion.n). However, among the b-10
suggestions for “innalzarsi.v”, the
frame CHANGE POSITION ON A SCALE appears in fourth position, and it can
be evoked by “innalzarsi.v” in sentences such as
“La marea si innalzava” (The tide was rising);
among the b-10 suggestions for “innocenza.n”, the frame
CANDIDNESS appears in first position, and it is evoked
by this LU in sentences such as “Lei rispose
con innocenza” (She answered genuinely). The
term “radiazione.n” is present in the dictionary
only with the meaning expulsion.n and it is linked
only to EXCLUDE MEMBER. Nevertheless, the
system proposes the frame NUCLEAR PROCESS
in first position, thus retrieving a correct
meaning of this LU, i.e., “radiation.n”. For “alleato.a”
(ally.n, also shown in Figure 1) the system
proposes a “correct” frame only in ninth position,
but in second position we find the frame
MEMBER OF MILITARY, which can plausibly be evoked.
Moreover, the LU “agnello.n” (lamb.n) evokes in
the dictionary only the frame FOOD; however, as
correctly suggested by the system, it is also a LU
of the frame ANIMALS. For “agnello.n”
the system also proposes, in sixth position,
PEOPLE BY MORALITY, which recalls the idea of
innocence and righteousness that represents (at least
for the Italian language) a metaphorical extension
of the meaning of “lamb.n”, strongly influenced by
the religious image of the lamb.</p>
      <p>In some other cases, the system suggests
relations between frames. For example, if we
consider the lexical entry “identico.a” (identical.a
from IDENTICALITY), we see among the best-10 frames
that the system proposes frames such as
SIMILARITY (first position) or DIVERSITY (seventh
position). If we look at the frame-to-frame relations in
FrameNet, we see that IDENTICALITY and
SIMILARITY, or IDENTICALITY and DIVERSITY, are
not directly connected even though they appear, on
closer analysis, closely related.</p>
    </sec>
    <sec id="sec-8">
      <title>6 IFrameNet Navigator</title>
      <p>In order to make the model valuable for the
annotators, we also developed a Graphical User
Interface, called IFrameNet Navigator. It allows
querying and navigating the geometrical representation
of semantic phenomena as it displays, for each
lexical entry in the dictionary, the best-10 frames.
These can be also selected to browse the set of
LUs assigned to the cluster underlying the frame,
as shown in Figure 1. Finally, each LU can be
selected to browse the list of corresponding
annotated sentences.</p>
      <p>The objectives of the Navigator are: (i) to
support the analysis of the currently modeled lexical
entries (and the corresponding LUs); (ii) to
support the validation of the current sentence
classification; (iii) to mine the CORIS corpus in order to
improve the semantic coverage of the resource
for the Italian language; (iv) in perspective, to
offer support for crowdsourcing.</p>
      <p>This tool will be publicly released to foster
collaborative validation and annotation as an
extension of the IFrameNet and the CORIS resources.</p>
    </sec>
    <sec id="sec-9">
      <title>7 Conclusions and Research Perspectives</title>
      <p>In this work, we presented the current state of the
IFrameNet project, which aims at developing a
large-scale lexical resource based on Frame
Semantics in Italian. Moreover, we investigated the
applicability of a method for the automatic
induction of FrameNet Lexical Units to improve the
coverage of the current resource, in terms of the
number of frames assigned to the almost 8,000 existing
lexical entries.</p>
      <p>With respect to previous work, i.e.,
Pennacchiotti et al. (2008), we empirically demonstrate
the beneficial impact of neural word embeddings
on the overall workflow in Italian. The robustness
of the adopted model is confirmed also when it is
applied to a resource with a limited average
number of frames associated to Lexical Units. In many
cases, the experimental evaluations showed the
valuable support of the method in discovering new
Lexical Units by suggesting novel evoked frames.
Moreover, the error analysis suggested that most
of the “discarded” frames still hold various
kinds of relationships with the “correct” ones as
defined in FrameNet, such as Inheritance or
Usage. In some cases, it also highlighted
metaphorical meanings that the lexical entries could assume.</p>
      <p>
        As future work, we will exploit the
produced IFrameNet Navigator to extend the
current Italian LU dictionary, support the annotation
of novel sentences and introduce frame-to-frame
relations in Italian. Another path that might be worth
investigating is the exploitation of
dependency-based word embeddings for the distributional
representation of LUs and frames. This may be
beneficial, since dependency-based contexts highlight
more functional similarities
        <xref ref-type="bibr" rid="ref17 ref5">(Levy and Goldberg,
2014)</xref>
        . Finally, we plan to use the derived frame
distributions to augment existing contextualized
embeddings in support of Frame Induction
        <xref ref-type="bibr" rid="ref11 ref27 ref28">(Sikos
and Padó, 2019)</xref>
        or Semantic Role Labeling
        <xref ref-type="bibr" rid="ref11 ref27 ref28">(Shi
and Lin, 2019)</xref>
        tasks.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Saba</given-names>
            <surname>Anwar</surname>
          </string-name>
          , Dmitry Ustalov, Nikolay Arefyev, Simone Paolo Ponzetto, Chris Biemann, and
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Panchenko</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>HHMM at SemEval-2019 Task 2: Unsupervised frame induction using contextualized word embeddings</article-title>
          . arXiv preprint arXiv:1905.01739.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Nikolay</given-names>
            <surname>Arefyev</surname>
          </string-name>
          , Boris Sheludko, Adis Davletov, Dmitry Kharchev, Alex Nevidomsky, and
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Panchenko</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Neural GRANNy at SemEval-2019 Task 2: A combined approach for better modeling of semantic relationships in semantic frame induction</article-title>
          .
          <source>In Proceedings of the 13th International Workshop on Semantic Evaluation</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Collin F.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Charles J.</given-names>
            <surname>Fillmore</surname>
          </string-name>
          , and
          <string-name>
            <given-names>John B.</given-names>
            <surname>Lowe</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>The Berkeley FrameNet project</article-title>
          .
          <source>In Proc. of COLING-ACL</source>
          , Montreal, Canada.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Collin F.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Ellsworth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Katrin</given-names>
            <surname>Erk</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>SemEval-2007 Task 19: Frame semantic structure extraction</article-title>
          .
          <source>In Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)</source>
          , pages
          <fpage>99</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Marco</given-names>
            <surname>Baroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Georgiana</given-names>
            <surname>Dinu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Germán</given-names>
            <surname>Kruszewski</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , pages
          <fpage>238</fpage>
          -
          <lpage>247</lpage>
          , Baltimore, Maryland, June. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Basili</surname>
          </string-name>
          , Silvia Brambilla, Danilo Croce, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Tamburini</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Developing a large scale FrameNet for Italian: the IFrameNet experience</article-title>
          .
          <source>CLiC-it 2017</source>
          , 11-12 December 2017, Rome, page
          <fpage>59</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Federica</given-names>
            <surname>Casadei</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>La polisemia nel vocabolario di base dell'italiano</article-title>
          .
          <source>Lingue e Linguaggi</source>
          ,
          <volume>12</volume>
          :
          <fpage>35</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Danilo</given-names>
            <surname>Croce</surname>
          </string-name>
          and
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Previtali</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Manifold learning for the semi-supervised induction of FrameNet predicates: An empirical investigation</article-title>
          .
          <source>In Proceedings of the 2010 Workshop on GEometrical Models of Natural Language Semantics</source>
          , pages
          <fpage>7</fpage>
          -
          <lpage>16</lpage>
          , Uppsala, Sweden, July. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Diego</given-names>
            <surname>De Cao</surname>
          </string-name>
          , Danilo Croce, and
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Basili</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Extensive evaluation of a framenet-wordnet mapping resource</article-title>
          .
          <source>In Nicoletta Calzolari (Conference Chair)</source>
          , Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, and Daniel Tapias, editors,
          <source>Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)</source>
          , Valletta, Malta, may.
          <source>European Language Resources Association (ELRA).</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Di Fabio</surname>
          </string-name>
          , Simone Conia, and
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Navigli</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>VerbAtlas: a novel large-scale verbal semantic resource and its application to semantic role labeling</article-title>
          .
          <source>In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          , pages
          <fpage>627</fpage>
          -
          <lpage>637</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Oscar</given-names>
            <surname>Ferrández</surname>
          </string-name>
          , Michael Ellsworth, Rafael Muñoz, and Collin F. Baker.
          <year>2010</year>
          .
          <article-title>Aligning FrameNet and WordNet based on semantic neighborhoods</article-title>
          .
          <source>In LREC</source>
          , volume
          <volume>10</volume>
          , pages
          <fpage>310</fpage>
          -
          <lpage>314</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Hartigan</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Wong</surname>
          </string-name>
          .
          <year>1979</year>
          .
          <article-title>A k-means clustering algorithm</article-title>
          .
          <source>Applied Statistics</source>
          ,
          <volume>28</volume>
          (
          <issue>1</issue>
          ):
          <fpage>100</fpage>
          -
          <lpage>108</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Avi</given-names>
            <surname>Hayoun</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Elhadad</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The Hebrew FrameNet project</article-title>
          .
          <source>In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)</source>
          , pages
          <fpage>4341</fpage>
          -
          <lpage>4347</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Richard</given-names>
            <surname>Johansson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Nugues</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Using WordNet to extend FrameNet coverage</article-title>
          .
          <source>In Proceedings of the Workshop on Building Frame-semantic Resources for Scandinavian and Baltic Languages, at NODALIDA</source>
          , pages
          <fpage>27</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Richard</given-names>
            <surname>Johansson</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Automatic expansion of the Swedish FrameNet lexicon: Comparing and combining lexicon-based and corpus-based methods</article-title>
          .
          <source>Constructions and Frames</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>92</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Omer</given-names>
            <surname>Levy</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yoav</given-names>
            <surname>Goldberg</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Dependency-based word embeddings</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          , pages
          <fpage>302</fpage>
          -
          <lpage>308</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Wang</given-names>
            <surname>Ling</surname>
          </string-name>
          , Chris Dyer, Alan W Black, and
          <string-name>
            <given-names>Isabel</given-names>
            <surname>Trancoso</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Two/too simple adaptations of word2vec for syntax problems</article-title>
          .
          <source>In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , pages
          <fpage>1299</fpage>
          -
          <lpage>1304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg S Corrado, and
          <string-name>
            <given-names>Jeff</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed Representations of Words and Phrases and their Compositionality</article-title>
          . In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , pages
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          . Curran Associates, Inc.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Martha</given-names>
            <surname>Palmer</surname>
          </string-name>
          , Paul Kingsbury, and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Gildea</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>The Proposition Bank: An annotated corpus of semantic roles</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>31</volume>
          (
          <issue>1</issue>
          ):
          <fpage>71</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Ellie</given-names>
            <surname>Pavlick</surname>
          </string-name>
          , Travis Wolfe, Pushpendre Rastogi, Chris Callison-Burch,
          <string-name>
            <given-names>Mark</given-names>
            <surname>Dredze</surname>
          </string-name>
          , and Benjamin Van Durme.
          <year>2015</year>
          .
          <article-title>FrameNet+: Fast paraphrastic tripling of FrameNet</article-title>
          .
          <source>In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)</source>
          , pages
          <fpage>408</fpage>
          -
          <lpage>413</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Marco</given-names>
            <surname>Pennacchiotti</surname>
          </string-name>
          , Diego De Cao, Roberto Basili, Danilo Croce, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Roth</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Automatic induction of FrameNet lexical units</article-title>
          .
          <source>In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>457</fpage>
          -
          <lpage>465</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Behrang</given-names>
            <surname>QasemiZadeh</surname>
          </string-name>
          , Miriam R. L. Petruck, Regina Stodden, Laura Kallmeyer, and
          <string-name>
            <given-names>Marie</given-names>
            <surname>Candito</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>SemEval-2019 task 2: Unsupervised lexical frame induction</article-title>
          .
          <source>In Proceedings of the 13th International Workshop on Semantic Evaluation</source>
          , pages
          <fpage>16</fpage>
          -
          <lpage>30</lpage>
          , Minneapolis, Minnesota, USA, June. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Rema</given-names>
            <surname>Rossini Favretti</surname>
          </string-name>
          , Fabio Tamburini, and Cristiana De Santis.
          <year>2002</year>
          .
          <article-title>CORIS/CODIS: A corpus of written Italian based on a defined and a dynamic model</article-title>
          .
          <source>In A rainbow of corpora: Corpus linguistics and the languages of the world</source>
          , pages
          <fpage>27</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Magnus</given-names>
            <surname>Sahlgren</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>The Word-Space Model</article-title>
          .
          <source>Ph.D. thesis</source>
          , Stockholm University.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Karin</given-names>
            <surname>Kipper Schuler</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>VerbNet: A broad-coverage, comprehensive verb lexicon</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Pennsylvania.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>Peng</given-names>
            <surname>Shi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Simple BERT models for relation extraction and semantic role labeling</article-title>
          .
          <source>CoRR</source>
          , abs/1904.05255.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Sikos</surname>
          </string-name>
          and Sebastian Padó.
          <year>2019</year>
          .
          <article-title>Frame identification as categorization: Exemplars vs prototypes in embeddingland</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Computational Semantics - Long Papers</source>
          , pages
          <fpage>295</fpage>
          -
          <lpage>306</lpage>
          , Gothenburg, Sweden, May. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>Sara</given-names>
            <surname>Tonelli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Emanuele</given-names>
            <surname>Pianta</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Three issues in cross-language frame information transfer</article-title>
          .
          <source>In Proceedings of the International Conference RANLP-2009</source>
          , pages
          <fpage>441</fpage>
          -
          <lpage>448</lpage>
          , Borovets, Bulgaria, September. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <given-names>Sara</given-names>
            <surname>Tonelli</surname>
          </string-name>
          , Daniele Pighin, Claudio Giuliano, and
          <string-name>
            <given-names>Emanuele</given-names>
            <surname>Pianta</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Semi-automatic development of FrameNet for Italian</article-title>
          .
          <source>In Proceedings of the FrameNet Workshop and Masterclass</source>
          , Milano, Italy.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>Sara</given-names>
            <surname>Tonelli</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Semi-automatic techniques for extending the FrameNet lexical database to new languages</article-title>
          .
          <source>Ph.D. thesis</source>
          , Università Ca' Foscari Venezia.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <given-names>Dmitry</given-names>
            <surname>Ustalov</surname>
          </string-name>
          , Alexander Panchenko, Andrei Kutuzov, Chris Biemann, and Simone Paolo Ponzetto.
          <year>2018</year>
          .
          <article-title>Unsupervised semantic frame induction using triclustering</article-title>
          . arXiv preprint arXiv:1805.04715.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          Zheng Xin Yong and Tiago Timponi Torrent
          .
          <year>2020</year>
          .
          <article-title>Semi-supervised deep embedded clustering with anomaly detection for semantic frame induction</article-title>
          .
          <source>In Proceedings of The 12th Language Resources and Evaluation Conference</source>
          , pages
          <fpage>3509</fpage>
          -
          <lpage>3519</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>