Tagging Semantic Types for Verb Argument Positions
Francesca Della Moretta
University of Pavia / Pavia, Italy
francesca.dellamoretta01@universitadipavia.it

Anna Feltracco
Fondazione Bruno Kessler / Trento, Italy
University of Pavia / Pavia, Italy
University of Bergamo / Bergamo, Italy
feltracco@fbk.eu

Elisabetta Jezek
University of Pavia / Pavia, Italy
jezek@unipv.it

Bernardo Magnini
Fondazione Bruno Kessler / Trento, Italy
magnini@fbk.eu

Abstract

Verb argument positions can be described by the semantic types that
characterise the words filling that position. We investigate a number of
linguistic issues underlying the tagging of an Italian corpus with the
semantic types provided by the T-PAS (Typed Predicate Argument Structure)
resource. We report both quantitative data about the tagging and a
qualitative analysis of cases of disagreement between two annotators.

1   Introduction

Words that fill a certain verb argument position are characterised by their
semantic properties. For instance, the fillers of the object position of the
verb “eat” are typically required to be edible objects, like “meat” and
“bread”. There is a vast literature in lexical semantics addressing this
issue from different perspectives, including the notion of selectional
preferences (Resnik, 1997; McCarthy and Carroll, 2003), the notion of
prototypical categories (Rosch, 1973), and the notion of lexical sets (Hanks
and Jezek, 2008; Jezek and Hanks, 2010). However, despite the large
theoretical interest, there is still a limited amount of empirical evidence
(e.g. annotated corpora) that can be used to support linguistic theories. In
particular, for the Italian language, there has been no systematic attempt to
annotate a corpus with semantic tagging of verb argument positions.

In this paper we assume a corpus-based perspective, and we focus on manually
tagging verb argument positions in a corpus with their corresponding semantic
classes, selected from those used in the T-PAS resource (Jezek et al., 2014).
We make use of an explicit set of semantic categories (i.e., an ontology of
Semantic Types), hierarchically organised (e.g. inanimate subsumes food): we
are interested in a qualitative analysis, a rather different perspective with
respect to recent works that exploit distributional properties of words
filling argument positions (Ponti et al., 2016; Ponti et al., 2017). We ran a
pilot annotation on a corpus of sentences. We aim at investigating how human
annotators assign semantic types to argument fillers, and to what extent they
agree or disagree.

A mid-term goal of this work is the extension of the T-PAS resource with a
corpus of annotated sentences aligned with the T-PASs of the verbs (see
Section 2). This would have a twofold impact: it would allow a corpus-based
linguistic investigation, and it would provide a unique dataset for training
semantic parsers for Italian.

The paper is structured as follows. Section 2 introduces T-PAS and the
ontology of semantic types used in the resource. Section 3 describes the
annotation task and the guidelines for annotators. Section 4 presents the
annotated corpus and the data of the inter-annotator agreement. Finally,
Section 5 discusses the most interesting phenomena that emerged during the
annotation exercise.

2    Overview of the T-PAS resource

The T-PAS resource is an inventory of 4241 Typed Predicate Argument
Structures (T-PASs), for example [[Human]] partecipa a ‘takes part in’
[[Event]], for 1000 average-polysemy Italian verbs. The T-PASs were acquired
from the ItWaC corpus (Baroni and Kilgarriff, 2006) by manual clustering of
distributional information about Italian verbs (Jezek et al., 2014),
following the Corpus Pattern Analysis (CPA) procedure (Hanks, 2004; Hanks and
Pustejovsky, 2005), which consists in recognising the relevant structures of
a verb and identifying the Semantic Types (STs) for their argument slots by
generalizing over the lexical sets observed in a sample of 250 concordances.
The current list of about 230 semantic types used in the resource (e.g.
human, event, location, artifact; henceforth, STs) is corpus-derived, that
is, STs are the result of manual generalization over the lexical sets found
in the argument positions in the concordances; for example, in the [[Event]]
argument position of partecipare we find gara, riunione, selezione, and so
forth. Besides the T-PASs and the hierarchically organized list of STs, the
resource contains a corpus of sentences that instantiate the different T-PASs
for each verb. Each sentence is currently tagged with the number of the T-PAS
it instantiates; the tag is located on the verb. No further information is
present in the instance except for the T-PAS number.

3    Annotating Semantic Types

The main goal of the annotation effort reported in this paper is to enrich
the annotation already present in the examples associated with each T-PAS.
Specifically, given a T-PAS of a verb and an example from the corpus, we
annotate the lexical items (in the example) generalised by the STs (in the
T-PAS).

For instance, Example (1) shows the T-PAS#1 of the verb vendere (Eng. ‘to
sell’), and a sentence associated with it. The task consists in annotating
prodotti tipici (Eng. ‘traditional products’) as a lexical item for
[[Inanimate]]-obj.

    (1)   [[Human | Business Enterprise]] vendere [[Inanimate | Animal]]
          “[..] il nome di un’associazione brasiliana che vendeva anche
          _prodotti tipici_”
          (Eng. ‘[..] the name of that Brazilian association that was
          selling _traditional products_’)

We annotate the content word(s) that is the head noun both in the case of
noun phrases (NP) (e.g. give a _cake_) and in the case of prepositional
phrases (PP) (e.g. give a _cake_ to his little _son_). If the head noun is a
quantifier, the quantifier is not tagged but the quantified element is (e.g.
to give a piece of _cake_).

Notice that more than one token can be annotated, e.g. in the case of
multiword expressions such as _prodotti tipici_ in Example (1), and more than
one item can be tagged for the same argument position, e.g. in case of
coordination, such as in “[..] che vendeva anche _prodotti tipici_ e
_cartoline_” (Eng. ‘[..] that was selling _traditional products_ and
_postcards_’).

If an argument is not present in the sentence (for instance, when the subject
of the verb is unexpressed), we do not signal this lack.

On the other hand, the annotation accounts for the following cases.

Semantic mismatches. Lexical items are annotated according to the T-PAS;
however, the annotator can use a different ST if she/he thinks the one
specified in the T-PAS does not apply. For instance, Example (2) reports
another instance of T-PAS#1 of vendere in which lavoro has been annotated as
[[Activity]], an ST not selected by the T-PAS#1 of vendere in object position
(see the T-PAS in Example (1)).

    (2)   “il _lavoro_ come qualsiasi altra cosa può essere acquistato e
          venduto.”
          (Eng. ‘jobs can be sold and bought just like anything.’)

Syntactic mismatches. We account for cases in which the syntactic role of the
lexical items does not match the one proposed in the T-PAS, e.g. in cases of
passive forms of verbs, where the subject and the prepositional phrase
introduced by da correspond respectively to the object and the subject of the
active construction. In Example (2), lavoro is the syntactic subject of the
passive clause, and it is generalized by [[Activity]] in the object position
of the T-PAS. In such cases we annotate both the ST of the lexical item and
its grammatical relation using the one in the T-PAS.

Pronouns. In case the argument of the verb is realised as a pronoun, we tag
the pronoun without assigning an ST. The pronoun is then linked to the
noun(s) it refers to, and this noun is actually
tagged with the ST label. In case the pronoun is agglutinated to the verb
(i.e. it is found in the same token as the verb, e.g. venderla, Eng. ‘to sell
it’), the part of the token corresponding to the pronoun is specified and, as
just described, the noun is annotated with the ST.

Impersonal constructions. In case of impersonal constructions with an
indefinite pronoun, the pronoun is annotated and the ST it refers to is
specified: e.g. in “In Germania [..] si vende a 10 euro al chilo” (Eng. ‘In
Germany, they sell it at 10 euro per kilo’), si is annotated with [[Human]].

We annotated the examples in T-PAS using CAT (Content Annotation Tool,
https://dh.fbk.eu/resources/cat-content-annotation-tool), a general-purpose
text annotation tool (Bartalesi Lenzi et al., 2012).

4       Results of the Pilot Annotation

The pilot annotation consisted in a selection of 3554 sentences extracted
from the current version of T-PAS (http://tpas.fbk.eu), associated with 25
Italian verbs, selected with different levels of polysemy (from a minimum of
2 to a maximum of 10 T-PASs) and argument structure. The average polysemy of
the 25 verbs (i.e. number of senses divided by the number of verbs) is 4.08,
and for each T-PAS (sense) we have an average of 34.84 annotated sentences.

The annotation was carried out by a master student in linguistics, who was
trained on the T-PAS resource, but had no previous experience in annotation.
The annotator was able to tag the 3554 sentences in one month.

Table 1 shows the main data of the pilot annotation. Overall, we annotated
5342 argument positions expressed in the 3554 sentences, with an average of
1.5 arguments per sentence. Out of the 230 Semantic Types available in the
T-PAS ontology, 99 have been selected during the annotation, which means that
we used just over 40% of the STs contained in the hierarchy.

    Data                      Total
    # Verbs                   25
    # T-PASs                  102
    # Examples                3554
    # Examples per T-PAS      34.84
    # Semantic Types used     99

    Table 1: Pilot annotation results.

4.1    Inter-annotator Agreement

In order to assess the reliability of the annotated data, we ran an
Inter-Annotator Agreement (IAA) test; Cinková et al. (2012) held an IAA test
on pattern identification using the CPA procedure on 30 English verbs. We
asked a second annotator to annotate a sample of 11 T-PASs associated with 3
different verbs (i.e., pulire, vendere and sbottonare). These verbs were
chosen because they correspond to about 10% of the annotated sentences.
Moreover, we selected them because they present a low or middle degree of
polysemy with respect to the group of 25 verbs initially annotated. The
second annotator was provided with the task guidelines, and a training
session was held to solve potential uncertainties in annotation. The second
annotator was trained on a selection of corpus instances derived from verb
lemmas which are not included in the evaluation we report here.

Table 2 shows the results of the IAA for each T-PAS. We measured both the
agreement on argument annotation, calculated with Dice’s coefficient
(Rijsbergen, 1979), and the agreement on ST annotation, calculated as the
accuracy (Manning et al., 2008) between the two annotators. As reported in
the last row of Table 2, the average agreement is 0.87 for argument
annotation, and 0.83 for ST annotation.

    T-PAS                    Argument (Dice’s value)    ST (Accuracy)
    Pulire, T-PAS#1          0.83                       0.74
    Pulire, T-PAS#2          1                          1
    Sbottonare, T-PAS#1      0.94                       0.89
    Sbottonare, T-PAS#2      0.95                       0.98
    Sbottonare, T-PAS#3      1                          1
    Sbottonare, T-PAS#4      0.88                       0.90
    Vendere, T-PAS#1         0.87                       0.81
    Vendere, T-PAS#2         0.33                       0.5
    Vendere, T-PAS#3         0.8                        1
    Vendere, T-PAS#4         1                          1
    Vendere, T-PAS#5         1                          1
    Overall average          0.87                       0.83

    Table 2: Inter Annotator Agreement.

A special case is vendere T-PAS#2, which shows the lowest score for both
argument and ST annotation. The annotation task allowed annotators to discard
sentences which in their opinion did not fit the sense of the T-PAS taken
into consideration. Vendere T-PAS#2 has only a few corpus instances, which
were mostly discarded or
tagged differently by the two annotators, causing low agreement in the
results for this T-PAS.

5     Discussion

This section discusses the most interesting phenomena that emerged during the
annotation exercise, particularly in light of the inter-annotator agreement.

5.1    Discussion: Argument Tagging

In this section, we focus on the disagreements we found in argument tagging.
The annotation task was difficult because the annotators had to identify the
semantic structure of the verbs, using syntactic criteria to distinguish
whether a lexical element was an argument or not.

Annotating pronouns was also a very demanding process since it implies the
identification of co-reference chains. Differences in argument annotation
between the two annotators, which impact the Dice score for arguments, lie
mainly in the annotation of pronouns and in the identification of
co-referents. One annotator tends to annotate all the pronouns contained in
an utterance, whereas the other tags only the pronoun which is an argument of
the verb taken into consideration. In addition, one annotator usually does
not identify co-referents which are lexically realised at a great distance
from the tagged verb, whereas the other sometimes annotates co-referents even
if the argument has already been identified. There are also differences
concerning the extension of the annotation, e.g. one annotator interpreted
prodotti tipici as a multiword expression and the other did not. Overall, we
obtained good agreement results, although some disagreements remain even
though we tried to reduce potential differences in annotation by treating as
many cases as possible in the guidelines.

5.2    Discussion: Semantic Type Tagging

The main goal of this section is to analyse the results of the IAA on ST
selection. Annotators used approximately 40 STs even though their expected
number (according to the T-PAS resource) was 11. Table 3 reports the ST usage
in the IAA experiment for each T-PAS.

    T-PAS                    ST expected                  ST used
                             (according to the T-PAS)     (A+B)
    Pulire, T-PAS#1          4                            23
    Pulire, T-PAS#2          3                            4
    Sbottonare, T-PAS#1      2                            6
    Sbottonare, T-PAS#2      2                            4
    Sbottonare, T-PAS#3      1                            1
    Sbottonare, T-PAS#4      1                            4
    Vendere, T-PAS#1         4                            23
    Vendere, T-PAS#2         2                            3
    Vendere, T-PAS#3         3                            3
    Vendere, T-PAS#4         1                            1
    Vendere, T-PAS#5         1                            1

    Table 3: Expected and used STs in the IAA test.

Annotators used approximately the expected number of semantic types with some
T-PASs, while with others they used many more. A higher number of STs
employed corresponds to a lower ST accuracy score (see Table 2); more
specifically, this correlation is shown by pulire T-PAS#1, sbottonare T-PAS#1
and #4, and vendere T-PAS#1. There are a number of reasons that explain this
ST usage. In some cases one annotator tends to tag the entity denoted by
single lexical items instead of the generalisations made by the T-PASs. This
causes a sentence-specific annotation that employs STs that are end nodes in
the hierarchy, which do not correspond to the ones in the reference T-PAS. As
future work, we plan to develop a methodology to normalize the STs to the
appropriate level of abstraction.

There are also linguistic reasons that intervene in the assignment of
different STs to the same lexical element. Annotators repeatedly captured the
phenomenon known as inherent polysemy by tagging the same lexical elements in
two totally different ways. An inherently polysemous noun denotes, depending
on the context, a single aspect of an entity which is inherently complex,
i.e. that can be described simultaneously by more than one ST (see Jezek,
2016, and references therein). An example is provided by the nouns that
denote countries, which in our annotation exercise have been tagged as
[[Business Enterprise]], [[Institution]] or [[Area]], pointing out their
complex nature as territorial, political and economic entities. In some cases
annotators have privileged different semantic components in the ST annotation
process. This is due to the context in which the words are embedded, which
determines certain interpretations instead of others. However, sometimes the
compositionality principle does not strictly define the meaning of an
utterance. Hence some lexical items remain underspecified so that they can
receive more than one ST at once.

For instance, in Example (3) one annotator tagged lente as [[Artifact]],
highlighting its nature
of manufactured object, whereas the other has annotated the lexical item as
[[Physical Object Part]], focusing on its nature of constituent element of a
bigger object.

    (3)   “Giles pulisce una _lente_ dei suoi occhiali.”
          (Eng. ‘Giles cleans a lens of his glasses’)

Moreover, there are differences in ST assignment caused by regular polysemy
(Apresjan, 1974), i.e. systematic alternations of meaning that apply to
classes of words (Jezek, 2016). IAA results reveal regular polysemy patterns
for nouns.

6       Conclusions

We performed a pilot experiment to tag the arguments of verbs, as recorded in
the T-PAS resource, with their associated semantic type. We obtained good
results in the annotation. By analyzing the cases of inter-annotator
disagreement, we were able to identify phenomena which lie at the core of
such disagreements, such as the presence of inherently polysemous words.
Ongoing work includes spelling out the rules for tagging polysemous words
more clearly in the guidelines.

References

Iurii Derenikovich Apresjan. 1974. Regular polysemy. Linguistics, 32.

Marco Baroni and Adam Kilgarriff. 2006. Large linguistically-processed web
   corpora for multiple languages. In Proceedings of the Eleventh Conference
   of the European Chapter of the Association for Computational Linguistics:
   Posters & Demonstrations, pages 87–90. Association for Computational
   Linguistics.

Valentina Bartalesi Lenzi, Giovanni Moretti, and Rachele Sprugnoli. 2012.
   CAT: the CELCT annotation tool. In Proceedings of the Eighth International
   Conference on Language Resources and Evaluation (LREC’12), pages 333–338.

Silvie Cinková, Martin Holub, Adam Rambousek, and Lenka Smejkalová. 2012. A
   database of semantic clusters of verb usages. In Proceedings of the Eighth
   International Conference on Language Resources and Evaluation (LREC’12),
   pages 3176–3183.

Patrick Hanks and Elisabetta Jezek. 2008. Shimmering lexical sets. In
   Proceedings of the XIII EURALEX International Congress, pages 391–402.

Patrick Hanks and James Pustejovsky. 2005. A pattern dictionary for natural
   language processing. Revue française de linguistique appliquée,
   10(2):63–82.

Patrick Hanks. 2004. Corpus pattern analysis. In Proceedings of the Eleventh
   EURALEX International Congress.

Elisabetta Jezek and Patrick Hanks. 2010. What lexical sets tell us about
   conceptual categories. Lexis, 4(7):22.

Elisabetta Jezek, Bernardo Magnini, Anna Feltracco, Alessia Bianchini, and
   Octavian Popescu. 2014. T-PAS: a resource of corpus-derived types
   predicate-argument structures for linguistic analysis and semantic
   processing. In Proceedings of the Ninth International Conference on
   Language Resources and Evaluation (LREC’14).

Elisabetta Jezek. 2016. The lexicon: an introduction. Oxford University
   Press.

Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. 2008.
   Introduction to Information Retrieval. Cambridge University Press, New
   York, NY, USA.

Diana McCarthy and John Carroll. 2003. Disambiguating nouns, verbs, and
   adjectives using automatically acquired selectional preferences.
   Computational Linguistics, 29(4):639–654.

Edoardo Maria Ponti, Elisabetta Jezek, and Bernardo Magnini. 2016. Grounding
   the lexical sets of causative-inchoative verbs with word embedding. In
   Proceedings of the Second Italian Conference on Computational Linguistics
   (CLiC-it 2016).

Edoardo Maria Ponti, Elisabetta Jezek, and Bernardo Magnini. 2017.
   Distributed representations of lexical sets and prototypes in causal
   alternation verbs. Italian Journal of Computational Linguistics, to
   appear.

Philip Resnik. 1997. Selectional preference and sense disambiguation. In
   Proceedings of the ACL SIGLEX Workshop on Tagging Text with Lexical
   Semantics: Why, What, and How, pages 52–57.

CJ van Rijsbergen. 1979. Information retrieval.

Eleanor H Rosch. 1973. Natural categories. Cognitive psychology,
   4(3):328–350.
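
As a supplementary illustration of the two agreement measures used in Section 4.1, the following minimal Python sketch computes Dice's coefficient over the items tagged by two annotators and the accuracy of their ST labels. It is only a sketch under stated assumptions: the representation of an annotated item as a (sentence id, token) pair, the choice to compute accuracy only on items tagged by both annotators, the handling of empty annotations, and the toy data are all hypothetical and are not details given in the text above.

```python
def dice_coefficient(items_a, items_b):
    """Dice = 2|A & B| / (|A| + |B|) over the sets of tagged items."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 1.0  # assumption: two empty annotations count as agreement
    return 2 * len(a & b) / (len(a) + len(b))


def st_accuracy(labels_a, labels_b):
    """Share of items tagged by both annotators that received the same ST."""
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        return 0.0  # assumption: no shared items means no measurable agreement
    return sum(labels_a[k] == labels_b[k] for k in shared) / len(shared)


# Hypothetical toy data: keys are (sentence id, token) pairs, values are STs.
annotator_a = {
    (1, "prodotti tipici"): "Inanimate",
    (2, "lavoro"): "Activity",
    (3, "lente"): "Artifact",
}
annotator_b = {
    (1, "prodotti tipici"): "Inanimate",
    (3, "lente"): "Physical Object Part",
    (4, "si"): "Human",
}

dice = dice_coefficient(annotator_a, annotator_b)   # 2 shared items out of 3 + 3
accuracy = st_accuracy(annotator_a, annotator_b)    # 1 of 2 shared items agree
print(f"Dice = {dice:.2f}, ST accuracy = {accuracy:.2f}")
```

In this toy case the two annotators share two of their three tagged items and agree on the ST of one of them, which mirrors the kind of disagreement discussed for lente in Example (3).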