<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Edoardo Maria Ponti</string-name>
          <email>ep490@cam.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elisabetta Jezek</string-name>
          <email>jezek@unipv.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernardo Magnini</string-name>
          <email>magnini@fbk.eu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fondazione Bruno Kessler</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi di Pavia</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Cambridge</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Lexical sets contain the words filling the argument positions of a verb in one of its senses. They can be extracted from corpora automatically. The purpose of this paper is to demonstrate that their vector representation based on word embedding provides insights into many linguistic phenomena, such as causative-inchoative verbs. A first experiment investigates the internal structure of the sets, which are known to be radial and continuous categories from a cognitive point of view. A second experiment shows that the distance between the intransitive subject set and the transitive object set is correlated with the spontaneity of the event expressed by the verb, defined according to morphological coding and frequency.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Italian abstract (in English translation). Lexical sets contain the words that occupy the argument positions of a verb in one of its senses, and they can be extracted automatically from corpora. The aim of this article is to demonstrate that their vector representation sheds light on some linguistic phenomena, such as verbs with the causative-inchoative alternation. One experiment investigates the internal structure of the sets, which at the cognitive level are considered radial and continuous categories. In addition, a second experiment shows that the distance between the set of intransitive subjects and the set of transitive objects is correlated with the spontaneity of the event expressed by the verb, defined according to morphological marking and frequency.</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
      <p>
        Lexicographic attempts to cope with verb sense disambiguation often rely on “lexical sets”
        <xref ref-type="bibr" rid="ref9">(Hanks, 1996)</xref>
        , which are the lists of corpus-derived words that appear as arguments for each distinct verb sense. The arguments are the “slots” that have to be filled to satisfy the valency of a verb (subject, object, etc.). For example, {gun, bullet, shot, projectile, rifle...} is the lexical set of the object for the sense ‘to shoot’ of to fire. In previous work, e.g. Montemagni et al. (1995), lexical sets were collected manually and compared through set analysis: the similarity between two sets was proportional to the extent of their intersection. We believe that improvements may stem from deriving the lexical sets automatically and from fully exploiting the semantic information of the fillers. In this work, we devise a method for extracting lexical sets from a large corpus and use a distributional semantics approach to perform our analyses. More specifically, we represent fillers as word vectors and compare them with spatial distance measures. In order to test the relevance of this approach for linguistic theory, we focus on a case study, namely the properties of verbs undergoing the causative-inchoative alternation. Section 1.1 outlines a framework for word embeddings and Section 1.2 introduces the case study. Section 2 reviews previous work, Section 3 presents the method and the data, and Section 4 reports the results of two experiments.
      </p>
      <sec id="sec-2-1">
        <title>1.1 Word Embedding</title>
        <p>
          The full exploitation of the semantic information inherent to the argument fillers of verbs can take advantage of some recent developments in distributional semantics. Efficient algorithms have been devised that map each word of a vocabulary into a corresponding vector of n real numbers, which can be thought of as a sequence of coordinates in an n-dimensional space
          <xref ref-type="bibr" rid="ref19">(Mikolov et al., 2013)</xref>
          . This mapping is obtained through unsupervised machine learning, based on the assumption that the meaning of a word can be inferred from its context, i.e. its neighbouring words in texts. This model has some relevant properties: the geometric closeness of two vectors corresponds to the similarity in meaning of the corresponding words. Moreover, its dimensions may have a semantic interpretation.
        </p>
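        <p>The link between geometric closeness and similarity in meaning can be illustrated with a minimal sketch. The three-dimensional vectors below are invented for illustration and do not come from the model used in this paper; the function name cosine_similarity is likewise only a label chosen here.</p>
        <preformat>
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors (assumption): real embedding models use hundreds of dimensions.
toy_vectors = {
    "gun": [0.9, 0.1, 0.0],
    "rifle": [0.8, 0.2, 0.1],
    "soup": [0.0, 0.2, 0.9],
}

sim_gun_rifle = cosine_similarity(toy_vectors["gun"], toy_vectors["rifle"])
sim_gun_soup = cosine_similarity(toy_vectors["gun"], toy_vectors["soup"])

# With these toy values, the related pair is closer than the unrelated one.
print(sim_gun_rifle, sim_gun_soup)
```
        </preformat>
        <p>In this toy setting, the vectors of gun and rifle point in nearly the same direction, whereas gun and soup are almost orthogonal.</p>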
      </sec>
      <sec id="sec-2-2">
        <title>1.2 Causative-Inchoative Alternation</title>
        <p>A possible testbed for the usefulness of representing the argument fillers as vectors is the class of verbs showing the so-called causative-inchoative alternation. These verbs appear either as transitive or as intransitive. In the first case, an agent brings about a change of state; in the second, the change of state of a patient is presented as spontaneous (e.g. to break, as in “Mary broke the key” vs. “the key broke”).</p>
        <p>
          The two alternative forms of these verbs can be morphologically asymmetrical: in this case, one has a derivative affix and the other does not. The first is labelled here as “marked”, the second as “basic”. Italian verbs with an asymmetrical alternation derive from the phenomenon of anticausativization. The intransitive form is marked since it is sometimes preceded by the clitic si
          <xref ref-type="bibr" rid="ref3">(Cennamo and Jezek, 2011)</xref>
          . Haspelmath (1993) maintains that verbs that show a cross-linguistic preference for a marked causative form (and a basic inchoative form) denote a more “spontaneous” situation. Spontaneity is understood by the author as the likelihood that the event occurs without the intervention of an agent. This work is non-committal with respect to whether spontaneity is an actual semantic factor. Rather, it is considered a notion useful for labelling the observed variations in morphology and frequency.
        </p>
        <p>In this way, a correlation between the form and the meaning of these verbs was demonstrated. Moreover, Samardzic and Merlo (2012) and Haspelmath et al. (2014) argue that verbs that appear more frequently (intra- and cross-linguistically) in the inchoative form also tend to derive the causative form morphologically. This time, the correlation holds between form and frequency. Conversely, situations entailing agentive participation tend to mark the inchoative form and to occur more frequently in the causative form.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2 Previous Work</title>
      <p>
        In the literature, many methods are available for
the automatic detection of verb classes, such as
causative-inchoative verbs. They exploit features
based on argument alternations, such as
subcategorization frames
        <xref ref-type="bibr" rid="ref14">(Joanis et al., 2008)</xref>
        . The
identification of verb classes displaying a diathesis
alternation was also performed through the analysis of
selectional preferences. Most notably, the lexical
items were compared via distributional semantics
        <xref ref-type="bibr" rid="ref15 ref17">(McCarthy, 2000)</xref>
        .
      </p>
      <p>
        These features were usually induced from automatic parses of large and heterogeneous corpora
        <xref ref-type="bibr" rid="ref24">(Schulte Im Walde, 2000)</xref>
        . In particular, the extraction of subcategorization frames was refined by including, for example, noise filters based on frequency
        <xref ref-type="bibr" rid="ref15">(Korhonen et al., 2000)</xref>
        . Our work is inspired by these attempts to automatically induce lexical information regarding verbs, but its direction of research is reversed. Rather than inducing verb classes from this information, it analyses this information for a given verb class in order to shed light on the properties of that class from the perspective of linguistic theory.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3 Data and Method</title>
      <p>
        The data are sourced from a sample of ItWac, a large Italian corpus gathered through web crawling
        <xref ref-type="bibr" rid="ref1">(Baroni et al., 2009)</xref>
        . This sample was enriched with morpho-syntactic information through the MATE-tools parser
        <xref ref-type="bibr" rid="ref2">(Bohnet, 2010)</xref>
        <sup>1</sup> and filtered by sentence length (&lt; 100). The resulting sample amounted to 2,029,454 sentences. A target group of 20 causative-inchoative verbs was taken from Haspelmath et al. (2014); they are listed here by their reported transitive/intransitive frequency ratio, from the highest to the lowest.
      </p>
      <p>close &gt; open &gt; improve &gt; break &gt; fill &gt; gather &gt; connect
&gt; split &gt; stop &gt; go out &gt; rise &gt; rock &gt; burn &gt; freeze &gt;
turn &gt; dry &gt; wake &gt; melt &gt; boil &gt; sink</p>
      <p><sup>1</sup> LAS scores for the relevant dependency relations: 0.751 with dobj (direct object), 0.719 with nsubj (subject), 0.691 with nsubjpass (subject of a passive verb).</p>
      <p>
        The extraction step consisted in identifying their argument fillers inside the sentences in the sample. In particular, the arguments considered were the subjects of intransitives (S) and the objects of transitives (O)
        <xref ref-type="bibr" rid="ref5">(Dixon, 1994)</xref>
        .<sup>2</sup> These arguments are relevant because they are deemed to share the same fillers
        <xref ref-type="bibr" rid="ref21">(Pustejovsky, 1995)</xref>
        .
      </p>
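      <p>The extraction step can be sketched as follows. This is only an illustrative reconstruction, not the code used for the experiments: the token format (dictionaries with the keys id, lemma, head and deprel), the label expl for the clitic, and the two example sentences are assumptions made here. The dependency labels dobj, nsubj and nsubjpass are those mentioned in footnote 1, and the treatment of si forms and passives follows footnote 2.</p>
      <preformat>
```python
from collections import Counter, defaultdict


def extract_fillers(sentences, target_lemmas):
    """Collect S (intransitive subject) and O (object) fillers per verb lemma.

    Each sentence is a list of token dicts with the hypothetical keys
    'id', 'lemma', 'head' and 'deprel'.
    """
    lexical_sets = defaultdict(lambda: {"S": Counter(), "O": Counter()})
    for sentence in sentences:
        # Group the dependents of every head token.
        dependents = defaultdict(list)
        for tok in sentence:
            dependents[tok["head"]].append(tok)
        for verb in sentence:
            if verb["lemma"] not in target_lemmas:
                continue
            deps = dependents[verb["id"]]
            # The clitic 'si' signals a marked intransitive form; it is not a filler.
            args = [d for d in deps if d["lemma"] != "si"]
            has_si = len(args) != len(deps)
            has_object = any(d["deprel"] == "dobj" for d in args)
            for dep in args:
                if dep["deprel"] in ("dobj", "nsubjpass"):
                    # Direct objects and passive subjects both count as O.
                    lexical_sets[verb["lemma"]]["O"][dep["lemma"]] += 1
                elif dep["deprel"] == "nsubj" and (has_si or not has_object):
                    # Subjects of si forms and of object-less verbs count as S.
                    lexical_sets[verb["lemma"]]["S"][dep["lemma"]] += 1
    return lexical_sets


# Two invented, hand-annotated sentences (assumption, for illustration only).
sentences = [
    # "Maria rompe la chiave" (transitive)
    [
        {"id": 1, "lemma": "Maria", "head": 2, "deprel": "nsubj"},
        {"id": 2, "lemma": "rompere", "head": 0, "deprel": "root"},
        {"id": 3, "lemma": "il", "head": 4, "deprel": "det"},
        {"id": 4, "lemma": "chiave", "head": 2, "deprel": "dobj"},
    ],
    # "La chiave si rompe" (intransitive with si)
    [
        {"id": 1, "lemma": "il", "head": 2, "deprel": "det"},
        {"id": 2, "lemma": "chiave", "head": 4, "deprel": "nsubj"},
        {"id": 3, "lemma": "si", "head": 4, "deprel": "expl"},
        {"id": 4, "lemma": "rompere", "head": 0, "deprel": "root"},
    ],
]

sets = extract_fillers(sentences, {"rompere"})
print(dict(sets["rompere"]["S"]), dict(sets["rompere"]["O"]))
```
      </preformat>
      <p>The sketch counts each filler lemma, so that absolute frequencies remain available for later weighting. Transitive subjects are deliberately left out, since only S and O are considered here.</p>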
      <p>
        These operations resulted in a database where each verb lemma had a single entry and was associated with a list of fillers, divided by argument type. With this procedure, lexical sets were extracted automatically, although they were not divided by verb sense. Afterwards, each argument filler was mapped to a vector using a space model pre-trained with Word2Vec
        <xref ref-type="bibr" rid="ref4">(Dinu et al., 2015)</xref>
        .<sup>3</sup>
      </p>
    </sec>
    <sec id="sec-5">
      <title>4 Experiments</title>
      <p>
        In order to bring to light the linguistic information concealed in the automatically extracted lexical sets, we devised two experiments. The first investigates the internal structure of lexical sets. Previous works based on set theory treated them as categorical sets, of which a filler either is or is not a member. Research in psychology, however, has long shown that the members of a linguistic set are found in a radial continuum where the most central one is the prototype for its category, and those at the periphery are less representative
        <xref ref-type="bibr" rid="ref16 ref22">(Rosch, 1973; Lakoff, 1987)</xref>
        .<sup>4</sup> Word vectors make it possible to capture this spatial continuum.
      </p>
      <p><sup>2</sup> Subjects of forms with si were treated as intransitive subjects. Subjects of passive verbs were treated as objects.</p>
      <p><sup>3</sup> It was generated by a CBOW algorithm with negative sampling, 300 dimensions, a context window of 10 tokens, pruning of infrequent words and sub-sampling.</p>
      <p><sup>4</sup> For previous work on lexical sets considering prototypicality in the context of the notion of shimmering, see Jezek and Hanks (2010).</p>
      <p>Once the fillers have been mapped to their respective vectors, a lexical set appears as a group of points in a multi-dimensional space. The centre of this group is the Euclidean mean of the vectors, which is itself a vector and is called the centroid. In the first experiment, we calculated the coordinates of the centroid of the lexical sets S and O for each selected verb.<sup>5</sup> Then we computed the cosine distance of every member vector of a set from the centroid of that set. The value of this metric goes from 0 (overlap) to 1 (maximum distance, for vectors whose cosine similarity is non-negative) and is useful to evaluate how far a filler is from its prototype. We obtained two sets of cosine distance values for each verb: these can be plotted as boxes and whiskers, as in Figure 1. The example represents those of dividere ‘to split’. The rectangles stand for the values in the second and third quartiles, whereas the horizontal line stands for the median.<sup>6</sup> From all these distance values, we picked the median value for each lexical set. The plot of these medians for the S set and the O set of each verb, ordered according to Haspelmath’s ranking, is shown in Figure 2.</p>
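      <p>The computation behind the first experiment can be sketched as follows, under stated assumptions: the fillers, their two-dimensional vectors and their frequencies are invented, the function names are chosen here for illustration, and the median is taken over filler types without weighting, which the text does not specify. The centroid is weighted by absolute frequency, as stated in footnote 5.</p>
      <preformat>
```python
import math
from statistics import median


def weighted_centroid(vectors, weights):
    """Frequency-weighted mean of a list of equal-length vectors."""
    total = sum(weights)
    dims = len(vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(vectors, weights)) / total
        for i in range(dims)
    ]


def cosine_distance(u, v):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Toy lexical set (assumption): filler -> (vector, absolute frequency).
lexical_set = {
    "porta": ([1.0, 0.0], 3),
    "finestra": ([0.8, 0.6], 1),
    "occhio": ([0.0, 1.0], 1),
}

vectors = [vec for vec, _ in lexical_set.values()]
weights = [freq for _, freq in lexical_set.values()]

# Prototype of the set, then the distance of every filler from it.
centroid = weighted_centroid(vectors, weights)
distances = {
    filler: cosine_distance(vec, centroid)
    for filler, (vec, _) in lexical_set.items()
}

# One summary value per lexical set, as plotted for each verb.
median_distance = median(distances.values())
print(centroid, median_distance)
```
      </preformat>
      <p>Because the most frequent filler pulls the centroid towards itself, a rare filler pointing in a different direction ends up at the periphery of the set.</p>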
      <p>Two main results can be observed from these plots. First, the S lexical set lies in a more compact range of distances, whereas O is more scattered. Second, the vectors of S tend to lie farther from the centroid. Both points are shown by the ranges in which the distance values fall. Moreover, we compared the averages of the medians for the ten verbs on the left part of the scale (frequently transitive) and for the ten verbs on the right (frequently intransitive). The average median in S was 0.696567 for the former and 0.585263 for the latter. The average median in O was 0.556878 for the former and 0.522418 for the latter. This suggests that the variation in O is not systematic, whereas the median of the distances in S is generally lower for verbs that lie in the bottom half of Haspelmath’s scale.</p>
      <p>The second experiment consisted in estimating the cosine distance between the centroid of S and the centroid of O for each verb. This operation was aimed at determining the extent to which the lexical sets of S and O overlap. In fact, Montemagni et al. (1995) and McCarthy (2000) observed on corpus data some asymmetries between these lexical sets, which in principle should share all their members.</p>
      <p><sup>5</sup> Every filler was weighted proportionally to its absolute frequency.</p>
      <p><sup>6</sup> The median is the value separating the higher half of the ordered values from the lower half.</p>
      <p>
        In our results, the distance between S and O seems to behave as a measure of spontaneity, understood as the cross-linguistic frequency and morphological markedness of a verb: the farther apart the centroids are, the more the verb tends to have a morphologically unmarked and more frequent intransitive form. To verify this, we compared the ranking of the 20 alternating verbs according to the ratio of their cross-linguistic frequency of transitive and intransitive forms
        <xref ref-type="bibr" rid="ref10">(Haspelmath et al., 2014)</xref>
        with a ranking based on the centroid distances of the same verbs. Both rankings are plotted in Figure 3: every verb is associated with its position in the two scales.
      </p>
      <p>Both scales display a common tendency. In particular, a Spearman’s rank correlation test was performed over them, yielding a moderate positive correlation of ρ = 0.56391, significant at p &lt; 0.01.<sup>7</sup></p>
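      <p>The second experiment can be sketched as follows. The four verbs, their vectors and their positions on the frequency scale are invented for illustration, the centroids are unweighted for brevity, and the rank correlation uses the textbook formula for data without ties. No significance test is computed in this sketch.</p>
      <preformat>
```python
import math


def centroid(vectors):
    """Unweighted mean of a list of equal-length vectors."""
    dims = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dims)]


def cosine_distance(u, v):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def ranks(values):
    """Rank positions starting at 1 for the smallest value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Toy data (assumption): verb -> (S vectors, O vectors), listed in the
# order of a hypothetical frequency scale (position 1 comes first).
toy_verbs = {
    "verb_a": ([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.2]]),
    "verb_b": ([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.8], [1.0, 1.2]]),
    "verb_c": ([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.4], [1.0, 0.6]]),
    "verb_d": ([[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]),
}

scale_positions = list(range(1, len(toy_verbs) + 1))
so_distances = [
    cosine_distance(centroid(s_vecs), centroid(o_vecs))
    for s_vecs, o_vecs in toy_verbs.values()
]

rho = spearman_rho(scale_positions, so_distances)
print(so_distances, rho)
```
      </preformat>
      <p>In the toy data only two adjacent verbs swap places between the two rankings, so the coefficient is high but not perfect.</p>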
    </sec>
    <sec id="sec-6">
      <title>5 Discussion</title>
      <p>
        The representation of the lexical sets of Italian causative-inchoative verbs as vectors was shown to provide insights into their internal structure and into their relation with spontaneity, defined according to morphological coding and frequency. The distances of the objects appeared to be distributed more uniformly, whereas those of the intransitive subjects were concentrated more densely and farther from the centroid. This difference cannot stem from the frequency of anaphoric fillers (as it could for transitive subjects), since both these argument positions share the discourse function of introducing new referents, and are hence occupied by fully referential fillers
        <xref ref-type="bibr" rid="ref6">(Du Bois, 1985)</xref>
        .
      </p>
      <p>
        Moreover, the medians of the distances of the subject fillers from their centroid were shown to vary. An interpretation is that they are sensitive to the frequency scale: this implies that frequently transitive (hence, non-spontaneous) verbs have semantically less homogeneous sets of referents, since these are farther from the prototype. This finding can possibly be related to the fact that non-spontaneous verbs impose fewer selectional restrictions on subjects
        <xref ref-type="bibr" rid="ref15 ref18">(McKoon and
Macfarland, 2000)</xref>
        .
      </p>
      <p><sup>7</sup> An alternative measure was considered for the ranking: the cardinality of the S-O intersection weighted by the set union. In this case, the Spearman correlation was ρ = 0.42255, but it was not significant (p ≈ 0.06).</p>
      <p>
        The lack of a perfect correlation between these vector distance and frequency measures may be due to errors in the automatic extraction and to data sparseness for the former, or to an insufficient sample of languages in the typological survey of Haspelmath et al. (2014) for the latter. A possible interpretation of the correlation is that the entities capable of bringing about a change of state and those that undergo it are indiscernible only for non-spontaneous verbs. Studies on causer entities have related them not only to the feature of agentivity, but also more generally to the so-called ‘teleological capability’
        <xref ref-type="bibr" rid="ref12">(Higginbotham, 1997)</xref>
        .
      </p>
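      <p>The alternative measure mentioned in footnote 7 can be sketched as follows. Reading “the cardinality of the S-O intersection weighted by the set union” as the ratio between the size of the intersection and the size of the union (the Jaccard index) is an assumption made here, and the fillers are invented for illustration.</p>
      <preformat>
```python
def set_overlap(s_fillers, o_fillers):
    """Size of the S-O intersection divided by the size of the S-O union."""
    s_set, o_set = set(s_fillers), set(o_fillers)
    union = s_set.union(o_set)
    if not union:
        return 0.0
    return len(s_set.intersection(o_set)) / len(union)


# Toy filler lists for one verb (assumption, for illustration only).
s_fillers = ["porta", "finestra", "negozio"]
o_fillers = ["porta", "finestra", "occhio", "libro"]

overlap = set_overlap(s_fillers, o_fillers)
print(overlap)
```
      </preformat>
      <p>Unlike the centroid distance, this measure treats every filler as either shared or not shared, so two semantically close but distinct fillers contribute nothing to the overlap.</p>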
    </sec>
    <sec id="sec-7">
      <title>6 Conclusion</title>
      <p>Our work provided evidence that the lexical sets of Italian causative-inchoative verbs are continuous and radial categories, whose distribution around the prototype varies to a great extent. This distribution is sensitive to the grammatical role and sometimes to the position of the verb in the so-called spontaneity scale. Moreover, a correlation was found between the distance separating the transitive object and intransitive subject lexical sets of a given verb and its cross-linguistic tendency to appear more frequently as intransitive or as transitive. Figure 4 is a synopsis of this result in the context of the correlations established in previous works.</p>
      <sec id="sec-7-3">
        <p>[Figure 4. Node labels: Spontaneous (?); Frequently Intransitive; Unmarked Intransitive; Distant S and O centres. Edge labels: ρ = 0.65 and ρ = 0.56.]</p>
        <p>In Figure 4, solid lines stand for correlations established on the basis of cross-linguistic evidence (frequency-form) and of evidence from the Italian language (frequency-lexical sets). The dotted line, on the other hand, suggests the existence of an underlying motivation for the correlations, which nonetheless remains unproven and undetermined in its nature. Its possible validation is left to future research.</p>
        <p>
          Future work should also test different pre-trained vector models in order to replicate these results. In particular, the new vector models could be optimized for similarity through semantic lexica
          <xref ref-type="bibr" rid="ref8">(Faruqui et al., 2015)</xref>
          or based on syntactic dependencies (Ó Séaghdha, 2010). The experiments in this work may be extended to other languages, either individually or through a multi-lingual word embedding
          <xref ref-type="bibr" rid="ref7">(Faruqui and Dyer, 2014)</xref>
          .
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Marco</given-names>
            <surname>Baroni</surname>
          </string-name>
          , Silvia Bernardini, Adriano Ferraresi, and
          <string-name>
            <given-names>Eros</given-names>
            <surname>Zanchetta</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>The wacky wide web: a collection of very large linguistically processed web-crawled corpora</article-title>
          .
          <source>Language resources and evaluation</source>
          ,
          <volume>43</volume>
          (
          <issue>3</issue>
          ):
          <fpage>209</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Bernd</given-names>
            <surname>Bohnet</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Very high accuracy and fast dependency parsing is not a contradiction</article-title>
          .
          <source>In Proceedings of the 23rd International Conference on Computational Linguistics</source>
          , pages
          <fpage>89</fpage>
          -
          <lpage>97</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Michela</given-names>
            <surname>Cennamo</surname>
          </string-name>
          and
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Jezek</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>The anticausative alternation in Italian</article-title>
          .
          <source>I luoghi della traduzione</source>
          , pages
          <fpage>809</fpage>
          -
          <lpage>823</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Georgiana</given-names>
            <surname>Dinu</surname>
          </string-name>
          , Angeliki Lazaridou, and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Baroni</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Improving zero-shot learning by mitigating the hubness problem</article-title>
          .
          <source>workshop contribution at ICLR</source>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Robert M. W.</given-names>
            <surname>Dixon</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <source>Ergativity</source>
          . Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>John W.</given-names>
            <surname>Du Bois</surname>
          </string-name>
          .
          <year>1985</year>
          .
          <article-title>Competing motivations</article-title>
          . Iconicity in syntax, pages
          <fpage>343</fpage>
          -
          <lpage>365</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Manaal</given-names>
            <surname>Faruqui</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Dyer</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Improving vector space word representations using multilingual correlation.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Manaal</given-names>
            <surname>Faruqui</surname>
          </string-name>
          , Jesse Dodge,
          <string-name>
            <given-names>Sujay K.</given-names>
            <surname>Jauhar</surname>
          </string-name>
          , Chris Dyer, Eduard Hovy, and
          <string-name>
            <given-names>Noah A.</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Retrofitting word vectors to semantic lexicons</article-title>
          .
          <source>In Proceedings of NAACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Hanks</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>Contextual dependency and lexical sets</article-title>
          .
          <source>International Journal of Corpus Linguistics</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>75</fpage>
          -
          <lpage>98</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Martin</given-names>
            <surname>Haspelmath</surname>
          </string-name>
          , Andreea Calude, Michael Spagnol, Heiko Narrog, and
          <string-name>
            <given-names>Elif</given-names>
            <surname>Bamyaci</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Coding causal-noncausal verb alternations: A form-frequency correspondence explanation</article-title>
          .
          <source>Journal of Linguistics</source>
          ,
          <volume>50</volume>
          (
          <issue>03</issue>
          ):
          <fpage>587</fpage>
          -
          <lpage>625</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Martin</given-names>
            <surname>Haspelmath</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>More on the typology of inchoative/causative verb alternations</article-title>
          .
          <source>Causatives and transitivity</source>
          ,
          <volume>23</volume>
          :
          <fpage>87</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>James</given-names>
            <surname>Higginbotham</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Location and causation</article-title>
          . Ms., University of Oxford.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Jezek</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Hanks</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>What lexical sets tell us about conceptual categories</article-title>
          .
          <source>Lexis</source>
          ,
          <volume>4</volume>
          (
          <issue>7</issue>
          ):
          <fpage>22</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Eric</given-names>
            <surname>Joanis</surname>
          </string-name>
          , Suzanne Stevenson, and
          <string-name>
            <given-names>David</given-names>
            <surname>James</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>A general feature space for automatic verb classification</article-title>
          .
          <source>Natural Language Engineering</source>
          ,
          <volume>14</volume>
          (
          <issue>03</issue>
          ):
          <fpage>337</fpage>
          -
          <lpage>367</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Anna</given-names>
            <surname>Korhonen</surname>
          </string-name>
          , Genevieve Gorrell, and
          <string-name>
            <given-names>Diana</given-names>
            <surname>McCarthy</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Statistical filtering and subcategorization frame acquisition</article-title>
          .
          <source>In Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora</source>
          , pages
          <fpage>199</fpage>
          -
          <lpage>206</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>George</given-names>
            <surname>Lakoff</surname>
          </string-name>
          .
          <year>1987</year>
          .
          <article-title>Women, fire, and dangerous things: What categories reveal about the mind</article-title>
          . University of Chicago Press.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Diana</given-names>
            <surname>McCarthy</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Using semantic preferences to identify verbal participation in role switching alternations</article-title>
          .
          <source>In Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference</source>
          , pages
          <fpage>256</fpage>
          -
          <lpage>263</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Gail</given-names>
            <surname>McKoon</surname>
          </string-name>
          and
          <string-name>
            <given-names>Talke</given-names>
            <surname>Macfarland</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Externally and internally caused change of state verbs</article-title>
          .
          <source>Language</source>
          , pages
          <fpage>833</fpage>
          -
          <lpage>858</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Efficient estimation of word representations in vector space</article-title>
          . In Workshop at ICLR.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Simonetta</given-names>
            <surname>Montemagni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Nilda</given-names>
            <surname>Ruimy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Vito</given-names>
            <surname>Pirrelli</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Ringing things which nobody can ring. A corpus-based study of the causative-inchoative alternation in Italian</article-title>
          .
          <source>Textus</source>
          ,
          <volume>8</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1000</fpage>
          -
          <lpage>1020</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>James</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>The generative lexicon</article-title>
          . The MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Eleanor H</given-names>
            <surname>Rosch</surname>
          </string-name>
          .
          <year>1973</year>
          .
          <article-title>Natural categories</article-title>
          .
          <source>Cognitive Psychology</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>328</fpage>
          -
          <lpage>350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Tanja</given-names>
            <surname>Samardzic</surname>
          </string-name>
          and
          <string-name>
            <given-names>Paola</given-names>
            <surname>Merlo</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>The meaning of lexical causatives in cross-linguistic variation</article-title>
          .
          <source>Linguistic Issues in Language Technology</source>
          ,
          <volume>7</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Sabine</given-names>
            <surname>Schulte im Walde</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Clustering verbs semantically according to their alternation behaviour</article-title>
          . In
          <source>Proceedings of the 18th Conference on Computational Linguistics</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>747</fpage>
          -
          <lpage>753</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Diarmuid</given-names>
            <surname>Ó Séaghdha</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Latent variable models of selectional preference</article-title>
          . In
          <source>Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>435</fpage>
          -
          <lpage>444</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>