=Paper= {{Paper |id=Vol-1885/15 |storemode=property |title=Complex Predicates with Light Verbs in VALLEX: From Formal Model to Lexicographic Description |pdfUrl=https://ceur-ws.org/Vol-1885/15.pdf |volume=Vol-1885 |authors=Václava Kettnerová,Markéta Lopatková |dblpUrl=https://dblp.org/rec/conf/itat/KettnerovaL17 }} ==Complex Predicates with Light Verbs in VALLEX: From Formal Model to Lexicographic Description== https://ceur-ws.org/Vol-1885/15.pdf
J. Hlaváčová (Ed.): ITAT 2017 Proceedings, pp. 15–22
CEUR Workshop Proceedings Vol. 1885, ISSN 1613-0073, © 2017 V. Kettnerová, M. Lopatková



                     Complex Predicates with Light Verbs in VALLEX:
                     From Formal Model to Lexicographic Description

                                          Václava Kettnerová and Markéta Lopatková

                         Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic
                                           {kettnerova,lopatkova}@ufal.mff.cuni.cz

Abstract: Natural languages are typically characterized by a large area where grammar and lexicon overlap. Complex predicates with light verbs represent a typical language phenomenon at the lexicon-grammar interface. Their theoretically adequate representation thus requires a close interplay between the lexicon and the grammar. In this paper, we introduce a formal model for the lexicographic description of Czech complex predicates of the given type. The central type of Czech complex predicates is composed of light verbs and predicative nouns. We demonstrate that although their syntactic structure formation is highly complex, it still exhibits enough regularity to be captured by formal rules.

1    Motivation

Complex predicates with light verbs (CPs) consist of two syntactic units, a light verb (LV) and a predicative noun (PN) (or, sporadically, a predicative adjective or adverb); this verb-noun pair forms a single predicative unit, as for example dát radu ‘give advice’, dostat rozkaz ‘get an order’, mít radost ‘be happy’ (lit. have joy), or uzavřít dohodu ‘make an agreement’. Due to their complex characteristics, CPs have proven to be challenging for syntactic theories as well as for natural language processing tasks.

Complex predicates with light verbs are characterized by a discrepancy in their syntax and semantics [1]: whereas the meaning of a CP is primarily expressed by the predicative noun, forming thus the semantic core of the CP, it is the semantically impoverished light verb which serves as the syntactic center of a sentence. We can exemplify this discrepancy on the CP uzavřít dohodu ‘make an agreement’, as used in (1). This CP is semantically characterized by three participants, namely ‘Party 1’ (Francie ‘France’), ‘Party 2’ (Německo ‘Germany’), and ‘Obligation’ (neútočení ‘non-aggression’); all these participants are provided by the predicative noun dohoda ‘agreement’. However, two of these participants – ‘Party 1’ and ‘Party 2’ – are expressed in the surface structure of the sentence not as nominal but as verbal modifications, namely as the subject and as the indirect object, while only the participant ‘Obligation’ is expressed as a nominal modification, namely as its attribute, see (1). The syntactic structure of the given sentence is thus formed by valency complementations of both the light verb and the predicative noun. In contrast, the sentence with the predicative verb uzavřít ‘close; turn off’, see e.g. (2), is characterized by two participants, ‘Agent’ and ‘Affected object’; being evoked by the verb, they are expressed on the surface as valency complementations of the given verb.

  (1) Francie_{Sb-verb} uzavřela s Německem_{InObj-verb} dohodu_{Obj-verb} o neútočení_{Atr-noun}.
      ‘France made an agreement with Germany on non-aggression.’
  (2) Hasiči_{Sb-verb} uzavřeli přívod_{Obj-verb} plynu.
      ‘Firemen turned off the gas main.’

Although the contribution of light verbs and predicative nouns to the syntactic structure formation of CPs has been put under scrutiny within various theoretical frameworks – see e.g. argument merger formulated within the Government Binding theory [2], argument fusion [3] and argument composition within the Lexical-Functional Grammar [4], and the study by Alonso Ramos drawing on the Meaning ↔ Text Theory [5] – many of its aspects still remain unclear.

Czech, as an inflectional language encoding syntactic relations by morphological cases, provides a great opportunity to study the distribution of valency complementations in syntactic structures of CPs since morphological forms of valency complementations serve as valuable clues for determining whether a certain valency complementation belongs to the light verb or to the predicative noun. However, none of the works focused on Czech CPs provides an explicit description of the syntactic structure formation of CPs, see esp. [6, 7].

In this paper, we summarize our theoretical results described earlier and relate the proposed model to extensive data annotation, see esp. [8, 9, 10]. We focus on the deep and surface structure of CPs, mainly with respect to the contribution of valency complementations to the syntactic structure of CPs made by the light verb and by the predicative noun and with respect to the role of coreference between the complementations in these structures (Section 3). On the basis of our theoretical findings, we propose an economical and linguistically informed formal model of CPs consisting of a grammatical part (Section 3) and

a lexical part (Section 4). Finally, grounded on extensive data annotation, we introduce an overall typology of CPs based on their coreferential characteristics and provide basic statistics for Czech CPs (Section 5).

2    VALLEX and FGD Framework

The proposed representation of CPs is elaborated within the Functional Generative Description (FGD), a stratificational and dependency-oriented theoretical linguistic framework [11]. One of the core concepts of FGD is that of valency [12]: at the layer of linguistically structured meaning (also the deep syntactic layer), it is the valency that provides the structure of a dependency tree. The valency theory of FGD has been applied in several valency lexicons, esp. PDT-Vallex¹ [13] and VALLEX² [14], and verified on extensive corpus data, esp. within the Prague Dependency Treebank (PDT)³. VALLEX, being the most elaborated lexicon of Czech verbs, forms a solid basis for the lexical component of FGD.

    ¹ http://lindat.mff.cuni.cz/services/PDT-Vallex/
    ² http://ufal.mff.cuni.cz/vallex/3.0/
    ³ http://ufal.mff.cuni.cz/pdt3.0

For the purpose of representation of language phenomena bridging between the grammar and the lexicon (e.g., diatheses and reciprocity), VALLEX is divided into a lexical part (i.e., the data component) and a grammatical part (i.e., the grammar component) [15, 16]. This division proves to be useful also for the representation of CPs.

Data component. The central organizing concept of the lexical part of VALLEX is the concept of lexeme. The lexeme associates a set of lexical forms, representing the verb in an utterance, with a set of lexical units, corresponding to their individual senses.

The data component consists of an inventory of lexical units of verbs with their respective valency frames underlying their deep syntactic structures. Each valency frame is modeled as a sequence of frame slots corresponding to valency complementations of a verb labeled by (rather coarse-grained) deep syntactic roles such as ‘Actor’ (ACT), ‘Patient’ (PAT), ‘Addressee’ (ADDR), ‘Effect’ (EFF), ‘Direction’, ‘Location’, ‘Manner’, etc. Further, the information on obligatoriness (‘?’ in front of a role label indicates its optionality in this text) and on possible morphological forms (here in subscript) is specified for each valency complementation. Each lexical unit can be further described by additional syntactic and syntactic-semantic information, e.g., on reciprocity, diatheses (as e.g. passivization), syntactico-semantic class etc.

For the lexicographic representation of CPs, the VALLEX lexicon was extended to cover also predicative nouns. In addition, the respective lexical units representing light verbs and predicative nouns were enriched with attributes that allow a user to derive valency structures of the whole CPs – these attributes are thoroughly described in Section 4.

Grammar component. The grammar component represents a part of the overall grammar of Czech; it stores formal rules directly related to the valency structure of verbs. This component serves for an economical description of systematic changes in the valency structure of verbs associated with various syntactic phenomena, esp. with passivization and reciprocity. It also comprises rules allowing for the derivation of deep and surface syntactic structures of CPs. These rules are presented in Section 3.

3    Grammar Component: Formation of Deep and Surface Syntactic Structures of CPs

3.1    Deep Syntactic Structure

The deep syntactic structure of CPs is formed by both valency complementations from the valency frame of the light verb and complementations from the frame of the predicative noun.

Predicative nouns. The valency frame of a predicative noun describes the usage of the noun in nominal structures. Individual valency complementations are semantically saturated – they correspond to individual semantic participants characterizing a situation denoted by the noun, as can be exemplified on the predicative noun dohoda_PN ‘agreement’, see its valency frame and example illustrating its nominal structure in (3) and the correspondence between its valency complementations and its semantic participants in (4):

  (3) dohoda_PN ‘agreement’:
      ACT_{2,pos} ADDR_{s+7} PAT_{na+6,o+6,inf,aby,zda,že,cont}
      dohoda Francie_{Party 1, ACT(2)} s Německem_{Party 2, ADDR(s+7)} o neútočení_{Obligation, PAT(o+6)}
      ‘the agreement of France with Germany on non-aggression’

  (4) ACT   ⇔  Party 1
      ADDR  ⇔  Party 2
      PAT   ⇔  Obligation

Light verbs. The deep structure of a light verb is formed by its valency frame, with one position (labeled CPHR) reserved for a predicative noun. A single light verb may be characterized by different deep syntactic structures, i.e., described by different valency frames which combine with different predicative nouns, see e.g. the light verb uzavřít_LV in (5) and (7).
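
As a minimal sketch of how a valency frame from the data component might be held in memory, the following Python fragment encodes frame (3) and mapping (4) of the predicative noun dohoda ‘agreement’. The class and field names (`Slot`, `ValencyFrame`, `participants`) are illustrative assumptions and do not reflect the actual VALLEX data format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Slot:
    """One frame slot: a deep syntactic role with its surface forms."""
    functor: str             # e.g. "ACT", "ADDR", "PAT", "CPHR"
    forms: tuple = ()        # morphological forms, e.g. ("s+7",)
    optional: bool = False   # True corresponds to '?' before the role label


@dataclass
class ValencyFrame:
    """Hypothetical container for the valency frame of one lexical unit."""
    lemma: str
    slots: list
    participants: dict = field(default_factory=dict)  # functor -> participant

    def functors(self):
        return [s.functor for s in self.slots]

    def saturated(self):
        """Functors that map directly onto a semantic participant."""
        return [f for f in self.functors() if f in self.participants]


# Frame (3) and mapping (4) of the predicative noun dohoda 'agreement'.
dohoda = ValencyFrame(
    lemma="dohoda",
    slots=[
        Slot("ACT", ("2", "pos")),
        Slot("ADDR", ("s+7",)),
        Slot("PAT", ("na+6", "o+6", "inf", "aby", "zda", "že", "cont")),
    ],
    participants={"ACT": "Party 1", "ADDR": "Party 2", "PAT": "Obligation"},
)
```

In this encoding, a light verb frame would simply carry an empty `participants` mapping (apart from a possible ‘Instigator’), which mirrors the semantic underspecification of light verb complementations.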

Light verbs, being (to some extent) semantically bleached, do not evoke any semantic participants. As a result, their valency complementations are characterized primarily as semantically underspecified deep syntactic positions, see schemes provided in (6) and (8) (compare also with [5]).

  (5) uzavřít_LV ‘make’:
      ACT_1 ADDR_{s+7} CPHR_4

  (6) ACT   ⇔  ∅
      ADDR  ⇔  ∅
      CPHR  ⇔  PN
      (This LV combines, e.g., with the PNs dohoda ‘agreement’ and sázka ‘bet’.)

  (7) uzavřít_LV ‘end, conclude’:
      ACT_1 CPHR_4

  (8) ACT   ⇔  ∅
      CPHR  ⇔  PN
      (This LV combines, e.g., with the PNs debata ‘discussion’ and vyšetřování ‘inquiry’.)

The only exception when a light verb contributes its semantic participant is represented by CPs with causative LVs. The causative LVs are seen as initiating the event denoted by the predicative noun selecting the given verb. These verbs thus contribute the ‘Instigator’ participant (and the nouns their respective semantic participants). For example, the LV uzavřít_LV ‘close’ that is instantiated, e.g., in the CP uzavřít přístup ‘close an access’ represents the causative LV, with the ‘Instigator’ mapped onto its ACT, see the valency frame of this verb (9) and the scheme of the mapping of semantic participants and valency complementations (10):

  (9) uzavřít_LV ‘close’:
      ACT_1 CPHR_4 ?BEN_3

 (10) ACT   ⇔  Instigator
      CPHR  ⇔  PN
      BEN   ⇔  ∅
      (This LV combines, e.g., with the PN přístup ‘access’.)

Within CPs, semantically underspecified valency complementations of LVs acquire semantic capacity via coreference with valency complementations of the predicative nouns with which they form CPs. These coreferential relations between valency complementations of LVs and complementations of PNs thus characterize the deep syntactic structure of individual CPs.

Complex predicates with light verbs. The deep syntactic structure of a CP is formed via an interplay between the valency frames of the respective LV and PN that form the given CP. A crucial role in the formation of the deep syntactic structure of a CP is played by (i) the number of semantic participants involved in a situation denoted by the CP, and (ii) coreferential relations between the valency complementations of the LV and the PN [8]. The deep syntactic structure of a CP thus consists of:

  • all nominal valency complementations, as they (directly) correspond to semantic participants;
  • all verbal valency complementations, as their semantic saturation is acquired in one of the following ways:
      – the CPHR valency position, as it is reserved for the predicative noun;
      – the verbal valency complementation corresponding to the ‘Instigator’ participant (if present);
      – other verbal valency complementations, as they corefer with individual nominal valency complementations.

Let us exemplify the deep structure formation on the example of the CP uzavřít dohodu ‘make an agreement’. The predicative noun dohoda_PN ‘agreement’ is characterized by three semantic participants corresponding to three valency complementations of this noun, as indicated in (3) and (4). The light verb uzavřít_LV ‘make’ is characterized by the valency frame provided in (5). The CPHR position of the light verb is filled with the PN dohoda ‘agreement’; the remaining valency complementations ACT and ADDR of the light verb enter into coreference with the ACT and ADDR of the given predicative noun, respectively (thus they obtain their semantic capacity from the given nominal complementations), see scheme (11), the sentence below and the deep dependency tree of the given CP in Figure 1:

 (11) uzavřít dohodu ‘make an agreement’:⁴
      ACT_LV   ↔  ACT_PN   ⇔  Party 1
      ADDR_LV  ↔  ADDR_PN  ⇔  Party 2
                  PAT_PN   ⇔  Obligation
      CPHR_LV              ⇔  dohoda_PN

      Francie_{Party 1} uzavřela s Německem_{Party 2} dohodu_PN o neútočení_{Obligation}.
      ‘France made an agreement with Germany on non-aggression.’

    ⁴ In the schemes, correspondence between semantic participants and valency complementations is marked with ⇔ whereas ↔ is reserved for coreference relations.

In many cases, a predicative noun can select different light verbs (and thus create different CPs), and so makes it possible to embed the expressed event “into different general semantic scenarios and thus to perspectivize it from the point of view of different participants” [8]. For example, the predicative noun rozkaz_PN

selects either the light verb dát_LV ‘to give’, or the light verb dostat_LV ‘to get’. This noun evokes three semantic participants, namely ‘Speaker’, ‘Recipient’, and ‘Information’. When it selects the light verb dát_LV ‘to give’, the situation expressed by this noun is viewed from the perspective of the ‘Speaker’ as it occupies the prominent subject position given by the ACT of the light verb, see example (12), while selecting the light verb dostat_LV ‘to get’, the situation is perspectivized from the ‘Recipient’, see example (13).

 (12) Generál_{Speaker, ACT-LV} dal rozkaz vojákům_{Recipient, ADDR-LV} k ústupu_{Information, PAT-PN}.
      ‘The general gave soldiers the order to retreat.’

 (13) Vojáci_{Recipient, ACT-LV} dostali od generála_{Speaker, ORIG-LV} rozkaz k ústupu_{Information, PAT-PN}.
      ‘Soldiers got the order to retreat by the general.’

                    uzavřít
        ACT          ADDR          dohoda.CPHR
                                   ACT    ADDR    PAT

Figure 1: The deep dependency structure of the CP uzavřít dohodu ‘make an agreement’.

3.2    Surface Syntactic Structure

The theoretical analysis supported by the extensive empirical data annotation has revealed that with CPs in Czech, each semantic participant is typically expressed in the surface sentence just once.⁵ Despite the fact that semantic participants are contributed to CPs – with the exception of the verbal ‘Instigator’ – by predicative nouns, Czech CPs have a strong tendency to express these participants in the surface structure as verbal modifications, see as well [7]. Namely, those participants characterizing a CP that are referred to by both valency complementations of the PN as well as (via coreference) complementations of the LV are primarily expressed on the surface as the verbal modifications. On the other hand, those participants that are mapped only onto valency complementations of the PN are realized as the nominal modifications.

    ⁵ The only exception is represented by the semantic participant mapped onto the nominal ACT: under certain conditions, this participant can be expressed twice, both as a verbal and as a nominal modification (e.g., Vrchní komisař_{Agens, ACT(1)-LV} již své_{Agens, ACT(pos)-PN} vyšetřování_PN zločinu_{Incident, PAT(2)-PN} uzavřel_LV. ‘The chief inspector has already concluded his investigation of the crime.’).

As a result, the rules governing the formation of the surface syntactic structure of Czech CPs can be summarized as follows:

  • As verbal modifications, all valency complementations from the valency frame of the light verb are primarily expressed in the surface structure, namely:⁶
      (i) the valency complementation filled by the predicative noun (the CPHR functor): it is obligatorily expressed in the surface structure as a verbal modification;
      (ii) the valency complementation corresponding to ‘Instigator’ (if present): it can be expressed in the surface structure only as a verbal modification;
      (iii) other verbal valency complementations: they are primarily expressed in the surface structure as verbal modifications, too.
  • As nominal modifications:
      (iv) those valency complementations from the valency frame of the predicative noun that are not in coreference with verbal ones are primarily expressed in the surface structure.⁷

    ⁶ We disregard the cases of valency complementations unexpressed on the surface due to their optionality, actual ellipsis, generalization etc.
    ⁷ In some cases, a nominal valency complementation coreferring with a verbal one may be alternatively expressed in the surface structure as a nominal modification, see e.g.
      (a) S Německem_{Party 2, ACT-LV} Francie_{Party 1, ACT-LV} uzavřela dohodu_{PN, CPHR-LV}. vs.
      (b) Francie_{Party 1, ACT-LV} uzavřela dohodu_{PN, CPHR-LV} s Německem_{Party 2, ACT-PN}.,
      with the ‘Party 2’ participant (s Německem) preferably analyzed as a verbal (in (a)) or a nominal (in (b)) modification.

For instance, within the CP uzavřít dohodu ‘make an agreement’, the following valency complementations are expressed in the surface structure: all the valency complementations of the LV uzavřít ‘to make’ (see its valency frame in (5)) are expressed as verbal modifications on the surface, namely: CPHR reserved for the predicative noun dohoda ‘agreement’ (principle (i)) in the direct object position, the verbal ACT and ADDR in the subject position and the indirect object position, respectively (principle (iii)) (these valency complementations refer to the ‘Party 1’ and ‘Party 2’ via coreference with the ACT and ADDR of the PN, see scheme (11)). From the valency complementations of the PN dohoda ‘agreement’ (the valency frame in (3)), only PAT (referring to ‘Obligation’, not being in coreference with any verbal complementation) is expressed on the surface as a nominal modification (principle (iv)); the remaining ACT and ADDR complementations of this noun (being in coreference with the verbal ACT and ADDR) are subject to systemic ellipsis; see the example sentence below and its surface dependency tree in Figure 2:

      Francie_{Party 1, ACT-LV} uzavřela s Německem_{Party 2, ADDR-LV} dohodu_{PN, CPHR-LV} o neútočení_{Obligation, PAT-PN}.
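
The deep-structure composition of Section 3.1 and the surface principles (i)–(iv) of Section 3.2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: frames are plain lists of functor labels, coreference is a dictionary from LV functors to PN functors, and the exceptions noted in the footnotes of Section 3.2 (double expression of the nominal ACT, optionality, alternative nominal expression) are ignored. The function name `derive_structures` and the `-LV`/`-PN` suffixes are hypothetical.

```python
def derive_structures(lv_frame, pn_frame, coref):
    """Sketch of the deep/surface rules for one complex predicate.

    lv_frame, pn_frame: lists of functor labels of the LV and the PN.
    coref: dict mapping an LV functor to the PN functor it corefers with.
    """
    shared = set(coref.values())
    # Deep structure: all verbal and all nominal complementations.
    deep = [f + "-LV" for f in lv_frame] + [f + "-PN" for f in pn_frame]
    # Principles (i)-(iii): every LV complementation surfaces as verbal.
    verbal = [f + "-LV" for f in lv_frame]
    # Principle (iv): PN complementations without a coreferring verbal one.
    nominal = [f + "-PN" for f in pn_frame if f not in shared]
    # The remaining PN complementations are subject to systemic ellipsis.
    elided = [f + "-PN" for f in pn_frame if f in shared]
    return {"deep": deep, "verbal": verbal, "nominal": nominal, "elided": elided}


# uzavřít dohodu 'make an agreement': frames (5) and (3), coreference as in (11).
result = derive_structures(
    lv_frame=["ACT", "ADDR", "CPHR"],
    pn_frame=["ACT", "ADDR", "PAT"],
    coref={"ACT": "ACT", "ADDR": "ADDR"},
)
```

For this input, the verbal modifications are the LV's ACT, ADDR and CPHR, the only nominal modification is the PN's PAT, and the PN's ACT and ADDR are elided, which matches the analysis of uzavřít dohodu given in the text.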

                        uzavřela.Pred                                   selects an appropriate LV). The value of the map at-
                                                                        tribute is a list of pairs of coreferring complementa-
           Francie.Sb      s.AuxP         dohodu.Obj                    tions. Figure 4 illustrates three lexical units for three
                               Německem.Obj    o.AuxP                   PNs, namely, dohoda PN ‘agreement’ (see also (3)),
                                                                        vyšetřovánı́ PN ‘investigation’, and přı́stup PN ‘access’.
                                                   neútočení.Atr           Each PN can be assigned more than one attribute
                                                                        map reflecting different coreference relations; in such
     Figure 2: The surface dependency structure of the                  cases, the map attributes are co-indexed with the rele-
     sentence Francie uzavřela s Německem dohodu o neú-              vant lvc attributes to allow for the correct formation
     točenı́. ‘France made an agreement with Germany on                of the CPs structures.
     non-aggression.’ (simplified)
                                                                        Attribute instig. The information on the mapping
     ‘France made an agreement with Germany on                          of the ‘Instigator’ onto a valency complementation of
non-aggression.’

4     Data Component: Interlinking Lexical Units

As was shown above, the deep and surface syntactic structures of CPs are formed as a combination of the valency structures of the respective predicative nouns and light verbs, with respect to the coreference between their individual valency complementations. The formation of both the deep and the surface structure is regular enough to be described by rules. These rules operate on the information provided by the data component of the lexicon.

In the data component of the VALLEX lexicon, individual lexical units of verbs and predicative nouns are described. In addition to the core valency information in the form of valency frames, these lexical units carry three special attributes that link the respective pairs of lexical units of the PN and LV and allow for the derivation of both deep and surface syntactic structures of the whole complex predicate, namely the attributes lvc, map and instig.

Attribute lvc. Respective lexical units of LVs and PNs that form CPs are linked by the attribute lvc, the value of which is a list of references to the respective lexical units. This attribute is attached to lexical units of predicative nouns and (for the user’s convenience) to lexical units of light verbs as well. Figure 3 illustrates three lexical units for the LV uzavřít_LV ‘make; end, conclude; close, terminate’ (see also (5), (7) and (9)).

If an LV forms syntactic structures with different PNs characterized by different coreferential relations, more instances of the attribute lvc (indexed with numbers) are assigned to the relevant lexical unit.

Attribute map. The information on the coreference between valency complementations of LVs and complementations of PNs is provided in the attribute map. This attribute is attached to PNs (as it is the PN that

relevant LVs is recorded in the attribute instig attached to the verbal valency frame, see lexical unit 3 in Figure 3.

If an LV forms syntactic structures with different PNs characterized by different coreferential relations, the instig attribute is co-indexed with the respective lvc attribute, containing the list of references to PNs that select the LV with the ‘Instigator’.

    uzavírat_impf, uzavřít_pf

     1 LV (impf: sjednávat; pf: sjednat) `make'
    -frame:   ACT_1 ADDR_s+7 CPHR_4
    -example: Firmy uzavíraly s Lucemburskem tajné dohody.
    -lvc:     dohoda-1, kompromis-1, kontrakt-1, obchod-1, pakt-1, sázka-1, smlouva-1
              `agreement, compromise, contract, trade, pact, bet, contract'

     2 LV (impf: ukončovat; pf: ukončit) `end; conclude'
    -frame:   ACT_1 CPHR_4
    -example: Policie uzavírá vyšetřování všech tří případů.
    -lvc:     debata-1, vyšetřování-1
              `discussion, investigation'

     3 LV (impf: zamezovat; pf: zamezit) `close; end, terminate'
    -frame:   ACT_1 CPHR_4 ?BEN_3
    -example: Dohoda ale uzavírá přístup na hranici.
    -lvc:     přístup-1 `access'
    -instig:  ACT

     …

Figure 3: Three lexical units for the LV uzavřít, which are instantiated, e.g., in the CPs uzavřít dohodu ‘make an agreement’, uzavřít vyšetřování ‘close an investigation’, and uzavřít přístup ‘close an access’, respectively (simplified).

5     Corpus Data Analysis

Tables 1 and 2 summarize the corpus analysis of Czech CPs formed by 129 verb lemmas from the VALLEX lexicon (those LVs were selected that have at least one valency frame with the CPHR functor in the PDT corpus, see Section 2). The CPs were extracted from the Czech National Corpus,
SYN2010, by the Word Sketch Engine [17], which makes it possible to identify for each verb lemma its nominal collocates expressed as its direct object (function has_obj4). From the obtained list of collocations, human annotators selected only those nominal collocates that represent PNs (560 noun lemmas in total). As a key criterion for identifying CPs, the coreference between the ACT of the noun and one of the valency complementations of the LV has been adopted [18]. This criterion was satisfied by 1,025 collocations, which represent the most frequent and semantically salient CPs of the selected light verbs.

    dohoda
     1 ujednání; domluva `agreement'
    -frame:   ACT_2,pos ADDR_s+7 PAT_na+6,o+6,inf,aby,zda,že,cont
    -example: dohoda Francie s Německem o neútočení
    -lvc:     uzavírat/uzavřít-1, vypovídat/vypovědět-5
    -map:     ACT_PN-ACT_LV & ADDR_PN-ADDR_LV
     …

    vyšetřování
     1 objasňování; prozkoumávání `investigation'
    -frame:   ACT_2,pos PAT_2,pos
    -example: vyšetřování všech odhalených případů zpronevěry
    -lvc:     uzavírat/uzavřít-2, vést-5
    -map:     ACT_PN-ACT_LV
     …

    přístup
     1 možnost někam vstoupit; přistoupení `access'
    -frame:   ACT_1 DIR3_do+2,k+3,na+4
    -example: přístup na hranici; přístup na trh práce
    -lvc:     otvírat/otevírat/otevřít-1, uzavírat/uzavřít-3
    -map:     ACT_PN-BEN_LV
     …

Figure 4: Three lexical units for the PNs dohoda ‘agreement’, vyšetřování ‘investigation’, and přístup ‘access’, respectively (simplified).

The identified CPs were further annotated with respect to the coreference between valency complementations of the LV and PN and with respect to the mapping of ‘Instigator’ (where it was relevant), see esp. [10]. Tables 1 and 2 summarize the results of the annotation process. Table 1 contains those CPs whose light verbs behave unambiguously with respect to the causative feature, i.e., they are either non-causative (∅ in the ‘Instig’ column) or causative. With the CPs with causative light verbs, the ‘Instigator’ was mapped either onto the verbal ACT or onto the verbal ORIG. In the annotation, 12 types of coreferential relation between verbal and nominal valency complementations were identified; the most frequent was the coreference between the ACT of the light verb and the ACT of the predicative noun (506 CPs, i.e. almost 50% of all analyzed CPs).

In the annotation, a specific type of CPs that are ambiguous with respect to the causativity of LVs was found (122 cases, i.e. almost 12% of CPs). These CPs are formed by PNs characterized by the semantic participants ‘Experiencer’ and ‘Stimulus’. Two situations occur with these CPs. First, the valency complementation of a PN corresponding to ‘Stimulus’ enters into coreference with the ACT of the LV with which the given PN forms the CP (as exemplified in (14), Figure 5); in this case, the LV behaves as a non-causative verb. Second, the given complementation of a PN is not in coreference with any verbal complementation; in this case, the LV contributes the ‘Instigator’ to the CP (example (15), Figure 6). For example, with the CP vyvolat protest, the semantic participant ‘Stimulus’ given by the PN protest_PN ‘protest’, mapped onto the PAT of the noun, either enters into coreference with the ACT of the LV vyvolat_LV ‘to raise’, see example (14), or remains without coreference, see example (15).

 (14) Stavba_Stimulus,ACT-LV dálnice vyvolala u obyvatel_Experiencer,LOC-LV protesty_PN,CPHR-LV.
      ‘The construction of the motorway has prompted protests of the inhabitants.’

 (15) Stavba_Instigator,ACT-LV dálnice vyvolala u obyvatel_Experiencer,LOC-LV protesty_PN,CPHR-LV proti postupu_Stimulus,PAT-PN radních.
      ‘The construction of the motorway has prompted protests of the inhabitants against the decision of councillors.’

    vyvolat.PRED
        stavba.ACT
            dálnice
        obyvatel.LOC
        protest.CPHR
            ACT
            PAT

Figure 5: The deep dependency structure of the non-causative example (14) (simplified).

    vyvolat.PRED
        stavba.ACT
            dálnice
        obyvatel.LOC
        protest.CPHR
            ACT
            postup.PAT
                radní

Figure 6: The deep dependency structure of the causative example (15) (simplified).
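The linking attributes lvc, map and instig, and the two readings of vyvolat protest, can be illustrated with a minimal Python sketch. It is not the VALLEX data format: the class `LexicalUnit`, the function `link` and the key `pn_only` are hypothetical names, frames are reduced to bare functors, and the two readings of the ambiguous CP are modelled as two separate pairs of units (with unindexed references `vyvolat` and `protest`), which is an assumption made only for illustration. The attribute values are taken from Figures 3 to 6 and from the first row of Table 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LexicalUnit:
    """Simplified lexical unit; field names mirror the lexicon attributes."""
    ref: str                                            # e.g. "dohoda-1"
    frame: List[str]                                    # functors only, forms omitted
    lvc: List[str] = field(default_factory=list)        # references to partner units
    map: Dict[str, str] = field(default_factory=dict)   # PN functor -> LV functor
    instig: Optional[str] = None                        # LV functor carrying 'Instigator'


def link(lv: LexicalUnit, pn: LexicalUnit) -> dict:
    """Summarize how an LV unit and a PN unit combine (hypothetical helper)."""
    # Both units must reference each other in their lvc attributes.
    if pn.ref not in lv.lvc or lv.ref not in pn.lvc:
        raise ValueError(f"{lv.ref} and {pn.ref} are not linked by lvc")
    # Every coreference pair must name functors present in the two frames.
    for pn_functor, lv_functor in pn.map.items():
        if pn_functor not in pn.frame or lv_functor not in lv.frame:
            raise ValueError(f"bad map entry {pn_functor}-{lv_functor}")
    return {
        "coreference": sorted(pn.map.items()),
        # PN complementations with no coreferent verbal complementation
        "pn_only": [f for f in pn.frame if f not in pn.map],
        "instigator": lv.instig,
    }


# Units from Figures 3 and 4 (simplified).
uzavrit_1 = LexicalUnit("uzavírat/uzavřít-1", ["ACT", "ADDR", "CPHR"],
                        lvc=["dohoda-1", "kompromis-1", "kontrakt-1", "obchod-1",
                             "pakt-1", "sázka-1", "smlouva-1"])
uzavrit_3 = LexicalUnit("uzavírat/uzavřít-3", ["ACT", "CPHR", "BEN"],
                        lvc=["přístup-1"], instig="ACT")
dohoda_1 = LexicalUnit("dohoda-1", ["ACT", "ADDR", "PAT"],
                       lvc=["uzavírat/uzavřít-1", "vypovídat/vypovědět-5"],
                       map={"ACT": "ACT", "ADDR": "ADDR"})
pristup_1 = LexicalUnit("přístup-1", ["ACT", "DIR3"],
                        lvc=["otvírat/otevírat/otevřít-1", "uzavírat/uzavřít-3"],
                        map={"ACT": "BEN"})

# Two readings of "vyvolat protest" (Table 2, Figures 5 and 6); assumed encoding.
vyvolat_noncaus = LexicalUnit("vyvolat", ["ACT", "LOC", "CPHR"], lvc=["protest"])
vyvolat_caus = LexicalUnit("vyvolat", ["ACT", "LOC", "CPHR"], lvc=["protest"],
                           instig="ACT")
protest_14 = LexicalUnit("protest", ["ACT", "PAT"], lvc=["vyvolat"],
                         map={"ACT": "LOC", "PAT": "ACT"})
protest_15 = LexicalUnit("protest", ["ACT", "PAT"], lvc=["vyvolat"],
                         map={"ACT": "LOC"})

for lv, pn in [(uzavrit_1, dohoda_1), (uzavrit_3, pristup_1),
               (vyvolat_noncaus, protest_14), (vyvolat_caus, protest_15)]:
    print(lv.ref, "+", pn.ref, "->", link(lv, pn))
```

In the causative reading, the PAT of protest stays in `pn_only`, which corresponds to the separately expressed postup.PAT node in Figure 6; in the non-causative reading it is coreferent with the verbal ACT and the list is empty.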
Table 1: Unambiguous Czech CPs identified in the corpus data, sorted according to causativity of LVs and types of coreference between verbal and nominal valency complementations.

‘Instig’   coreference                            #      %     examples
∅          ACT_PN – ACT_LV                        506    49.4  mít chuť, vést život, uzavřít debatu, uzavřít vyšetřování
           ACT_PN – ACT_LV  & ADDR_PN – ADDR_LV   120    11.7  dát rozkaz, poskytnout rozhovor, uzavřít dohodu, uzavřít sázku
           ACT_PN – ACT_LV  & PAT_PN – ADDR_LV     93     9.1  navázat vztah
           ACT_PN – ORIG_LV & ADDR_PN – ACT_LV     28     2.7  dostat nabídku, získat informace
           ACT_PN – ORIG_LV & PAT_PN – ACT_LV      22     2.1  dostat ránu, dostat pokutu
           ACT_PN – ACT_LV  & PAT_PN – DIR3_LV     28     2.7  obracet pozornost, položit důraz
           ACT_PN – ACT_LV  & PAT_PN – LOC_LV      22     2.1  najít inspiraci, najít potěšení
           ACT_PN – LOC_LV  & PAT_PN – ACT_LV      22     2.1  najít odezvu, nalézt pochopení
ACT_LV     ACT_PN – ADDR_LV                        53     5.2  dát naději, vynést slávu, vzít odvahu
           ACT_PN – LOC_LV                         26     2.5  probouzet podezíravost, vzbudit zdání
           ACT_PN – BEN_LV                          8     0.8  zvednout náladu, otevřít přístup, uzavřít přístup
ORIG_LV    ACT_PN – ACT_LV                         18     1.8  dostat příležitost, získat výhodu



Table 2: Ambiguous Czech CPs identified in the corpus data, sorted according to causativity of LVs and types of coreference between verbal and nominal valency complementations.

without ‘Instigator’                    with ‘Instigator’
coreference                             ‘Instig’   coreference          #     %     examples
ACT_PN – LOC_LV  & PAT_PN – ACT_LV      ACT_LV     ACT_PN – LOC_LV      92    9.0   vyvolat protest, budit důvěru
ACT_PN – ADDR_LV & PAT_PN – ACT_LV      ACT_LV     ACT_PN – ADDR_LV     23    2.2   přinést radost, činit obtíž
ACT_PN – LOC_LV  & ORIG_PN – ACT_LV     ACT_LV     ACT_PN – LOC_LV       3    0.3   vzbudit pocit
ACT_PN – LOC_LV  & ADDR_PN – ACT_LV     ACT_LV     ACT_PN – LOC_LV       4    0.4   vyvolat podezření
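The percentages in Tables 1 and 2 can be recomputed from the row counts. The short Python sketch below assumes that every percentage is taken relative to the 1,025 collocations that satisfied the coreference criterion; the names `TOTAL_CPS`, `ROWS` and `share` are illustrative only.

```python
TOTAL_CPS = 1025  # collocations satisfying the coreference criterion

# (count, percentage as printed) for each row of Table 1, then Table 2.
# Row counts are not summed here: the source does not state that rows are disjoint.
ROWS = [
    (506, 49.4), (120, 11.7), (93, 9.1), (28, 2.7), (22, 2.1), (28, 2.7),
    (22, 2.1), (22, 2.1), (53, 5.2), (26, 2.5), (8, 0.8), (18, 1.8),
    (92, 9.0), (23, 2.2), (3, 0.3), (4, 0.4),
]


def share(count: int, total: int = TOTAL_CPS) -> float:
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Rows whose printed percentage differs from the recomputed one.
mismatches = [(c, p, share(c)) for c, p in ROWS if abs(share(c) - p) > 1e-9]

# The four rows of Table 2 together give the ambiguous CPs.
ambiguous_total = 92 + 23 + 3 + 4

print("mismatching rows:", mismatches)
print("ambiguous CPs:", ambiguous_total, f"({share(ambiguous_total)} %)")
```

Under this assumption, every printed percentage is reproduced, and the four rows of Table 2 add up to the 122 ambiguous cases (almost 12%) reported in the text.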



6        Conclusion

In this paper, we have summarized the results of our analysis of Czech complex predicates with light verbs. We have described their lexicographic model, which is based on a close cooperation of the lexical and grammar components. Although our proposal is primarily designed for the Valency Lexicon of Czech Verbs VALLEX, we suppose that its main tenets can be easily adopted by other lexical resources as well. Finally, we have introduced the annotation of a large collection of linguistic data, which will be integrated in the VALLEX lexicon soon.

Acknowledgements

The work on this project has been supported by the grant of the Czech Science Foundation (project GA15-09979S) and partially by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).
This work has been using language resources distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).

References

 [1] Algeo, J.: Having a look at the expanded predicate. In Aarts, B., Meyer, C.F., eds.: The Verb in Contemporary English: Theory and Description. Cambridge University Press, Cambridge (1995) 203–217
 [2] Grimshaw, J., Mester, A.: Light verbs and θ-marking. Linguistic Inquiry 19 (1988) 205–232
 [3] Butt, M.: The light verb jungle: Still hacking away. In Amberber, M., Baker, B., Harvey, M., eds.: Complex Predicates in Cross-Linguistic Perspective. Cambridge University Press, Cambridge (2010) 48–78
 [4] Hinrichs, E., Kathol, A., Nakazawa, T.: Complex Predicates in Nonderivational Syntax. Syntax and Semantics 30. Academic Press, San Diego (1998)
 [5] Alonso Ramos, M.: Towards the synthesis of support verb constructions: Distribution of syntactic actants between the verb and the noun. In Wanner, L., Mel’čuk, I.A., eds.: Selected Lexical and Grammatical Issues in the Meaning-Text Theory. John Benjamins Publishing Company, Amsterdam, Philadelphia (2007) 97–137
 [6] Radimský, J.: Verbo-nominální predikát s kategoriálním slovesem. Editio Universitatis Bohemiae Meridionalis, České Budějovice (2010)
 [7] Macháčková, E.: Constructions with verbs and abstract nouns in Czech (analytical predicates). In Čmejrková, S., Štícha, F., eds.: The Syntax of Sentence and Text: A Festschrift for František Daneš. Volume 42 of Linguistic and Literary Studies in Eastern Europe. John Benjamins Publishing Company, Amsterdam, Philadelphia (1994) 365–374
 [8] Kettnerová, V., Lopatková, M.: At the lexicon-grammar interface: The case of complex predicates in the functional generative description. In Hajičová, E., Nivre, J., eds.: Proceedings of Depling 2015, Uppsala, Sweden, Uppsala University (2015) 191–200
 [9] Kettnerová, V.: Syntaktická struktura komplexních predikátů. Slovo a slovesnost (2017) (in print).
[10] Kettnerová, V., Lopatková, M.: Ke koreferenci u komplexních predikátů s kategoriálním slovesem. Korpus – gramatika – axiologie (2017) (submitted).
[11] Sgall, P., Hajičová, E., Panevová, J.: The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel, Dordrecht (1986)
[12] Panevová, J.: Valency Frames and the Meaning of the Sentence. In Luelsdorff, P.A., ed.: The Prague School of Structural and Functional Linguistics. John Benjamins Publishing Company, Amsterdam/Philadelphia (1994) 223–243
[13] Urešová, Z.: Valence sloves v Pražském závislostním korpusu. Ústav formální a aplikované lingvistiky, Praha (2011)
[14] Lopatková, M., Kettnerová, V., Bejček, E., Vernerová, A., Žabokrtský, Z.: Valenční slovník českých sloves VALLEX. Karolinum, Praha (2016)
[15] Kettnerová, V., Lopatková, M., Bejček, E.: The Syntax-Semantics Interface of Czech Verbs in the Valency Lexicon. In: Proceedings of the XV EURALEX International Congress, Oslo, University of Oslo (2012) 434–443
[16] Lopatková, M., Kettnerová, V.: Alternations: From Lexicon to Grammar And Back Again. In Hajičová, E., Boguslavsky, I., eds.: Proceedings of the Workshop on Grammar and Lexicon: Interactions and Interfaces (GramLex), Ōsaka, Japan, ICCL, The COLING 2016 Organizing Committee (2016) 18–27
[17] Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., Suchomel, V.: The sketch engine: ten years on. Lexicography ASIALEX 1 (2014) 7–36
[18] Kettnerová, V., Lopatková, M., Bejček, E., Vernerová, A., Podobová, M.: Corpus based identification of Czech light verbs. In Gajdošová, K., Žáková, A., eds.: Proceedings of the Seventh International Conference Slovko 2013; Natural Language Processing, Corpus Linguistics, E-learning, Lüdenscheid, Germany, Slovak National Corpus, Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, RAM-Verlag (2013) 118–128