=Paper=
{{Paper
|id=Vol-1885/15
|storemode=property
|title=Complex Predicates with Light Verbs in VALLEX: From Formal Model to Lexicographic Description
|pdfUrl=https://ceur-ws.org/Vol-1885/15.pdf
|volume=Vol-1885
|authors=Václava Kettnerová,Markéta Lopatková
|dblpUrl=https://dblp.org/rec/conf/itat/KettnerovaL17
}}
==Complex Predicates with Light Verbs in VALLEX: From Formal Model to Lexicographic Description==
J. Hlaváčová (Ed.): ITAT 2017 Proceedings, pp. 15–22. CEUR Workshop Proceedings Vol. 1885, ISSN 1613-0073, © 2017 V. Kettnerová, M. Lopatková

Václava Kettnerová and Markéta Lopatková

Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic

{kettnerova,lopatkova}@ufal.mff.cuni.cz

'''Abstract:''' Natural languages are typically characterized by a large area where grammar and lexicon overlap. Complex predicates with light verbs represent a typical language phenomenon at the lexicon-grammar interface. Their theoretically adequate representation thus requires a close interplay between the lexicon and the grammar. In this paper, we introduce a formal model for the lexicographic description of Czech complex predicates of the given type. The central type of Czech complex predicates is composed of light verbs and predicative nouns. We demonstrate that although their syntactic structure formation is highly complex, it still exhibits enough regularity to be captured by formal rules.

===1 Motivation===

Complex predicates with light verbs (CPs) consist of two syntactic units, a light verb (LV) and a predicative noun (PN) (or, sporadically, a predicative adjective or adverb); this verb-noun pair forms a single predicative unit, as for example dát radu ‘give advice’, dostat rozkaz ‘get an order’, mít radost ‘be happy’ (lit. have joy), or uzavřít dohodu ‘make an agreement’. Due to their complex characteristics, CPs have proven to be challenging for syntactic theories as well as for natural language processing tasks.

Complex predicates with light verbs are characterized by a discrepancy between their syntax and semantics [1]: whereas the meaning of a CP is primarily expressed by the predicative noun, which thus forms the semantic core of the CP, it is the semantically impoverished light verb that serves as the syntactic center of a sentence. We can exemplify this discrepancy on the CP uzavřít dohodu ‘make an agreement’, as used in (1). This CP is semantically characterized by three participants, namely ‘Party 1’ (Francie ‘France’), ‘Party 2’ (Německo ‘Germany’), and ‘Obligation’ (neútočení ‘non-aggression’); all these participants are provided by the predicative noun dohoda ‘agreement’. However, two of these participants, ‘Party 1’ and ‘Party 2’, are expressed in the surface structure of the sentence not as nominal but as verbal modifications, namely as the subject and as the indirect object, while only the participant ‘Obligation’ is expressed as a nominal modification, namely as its attribute, see (1). The syntactic structure of the given sentence is thus formed by valency complementations of both the light verb and the predicative noun. In contrast, the sentence with the predicative verb uzavřít ‘close; turn off’, see e.g. (2), is characterized by two participants, ‘Agent’ and ‘Affected object’; being evoked by the verb, they are expressed on the surface as valency complementations of the given verb.

: (1) Francie<sub>Sb-verb</sub> uzavřela s Německem<sub>InObj-verb</sub> dohodu<sub>Obj-verb</sub> o neútočení<sub>Atr-noun</sub>.
:: ‘France made an agreement with Germany on non-aggression.’
: (2) Hasiči<sub>Sb-verb</sub> uzavřeli přívod<sub>Obj-verb</sub> plynu.
:: ‘Firemen turned off the gas main.’

The contribution of light verbs and predicative nouns to the syntactic structure formation of CPs has been put under scrutiny within various theoretical frameworks; see e.g. argument merger formulated within the Government Binding theory [2], argument fusion [3] and argument composition within the Lexical-Functional Grammar [4], and the study by Alonso Ramos drawing on the Meaning ↔ Text Theory [5]. Nevertheless, many of its aspects still remain unclear. Czech, as an inflectional language encoding syntactic relations by morphological cases, provides a great opportunity to study the distribution of valency complementations in syntactic structures of CPs, since morphological forms of valency complementations serve as valuable clues for determining whether a certain valency complementation belongs to the light verb or to the predicative noun. However, none of the works focused on Czech CPs provides an explicit description of the syntactic structure formation of CPs, see esp. [6, 7].

In this paper, we summarize our theoretical results described earlier and relate the proposed model to an extensive data annotation, see esp. [8, 9, 10]. We focus on the deep and surface structure of CPs, mainly with respect to the contribution of valency complementations to the syntactic structure of CPs made by the light verb and by the predicative noun and with respect to the role of coreference between the complementations in these structures (Section 3). On the basis of our theoretical findings, we propose an economical and linguistically informed formal model of CPs consisting of a grammatical part (Section 3) and a lexical part (Section 4). Finally, grounded on extensive data annotation, we introduce an overall typology of CPs based on their coreferential characteristics and provide basic statistics for Czech CPs (Section 5).

===2 VALLEX and FGD Framework===

The proposed representation of CPs is elaborated within the Functional Generative Description (FGD), a stratificational and dependency-oriented theoretical linguistic framework [11]. One of the core concepts of FGD is that of valency [12]: at the layer of linguistically structured meaning (also the deep syntactic layer), it is the valency that provides the structure of a dependency tree. The valency theory of FGD has been applied in several valency lexicons, esp. PDT-Vallex (http://lindat.mff.cuni.cz/services/PDT-Vallex/) [13] and VALLEX (http://ufal.mff.cuni.cz/vallex/3.0/) [14], and verified on extensive corpus data, esp. within the Prague Dependency Treebank (PDT, http://ufal.mff.cuni.cz/pdt3.0). VALLEX, being the most elaborated lexicon of Czech verbs, forms a solid basis for the lexical component of FGD.

For the purpose of representation of language phenomena bridging between the grammar and the lexicon (e.g., diatheses and reciprocity), VALLEX is divided into a lexical part (i.e., the data component) and a grammatical part (i.e., the grammar component) [15, 16]. This division proves to be useful also for the representation of CPs.

'''Data component.''' The central organizing concept of the lexical part of VALLEX is the concept of lexeme. The lexeme associates a set of lexical forms, representing the verb in an utterance, with a set of lexical units, corresponding to their individual senses.

The data component consists of an inventory of lexical units of verbs with their respective valency frames underlying their deep syntactic structures. Each valency frame is modeled as a sequence of frame slots corresponding to valency complementations of a verb labeled by (rather coarse-grained) deep syntactic roles such as ‘Actor’ (ACT), ‘Patient’ (PAT), ‘Addressee’ (ADDR), ‘Effect’ (EFF), ‘Direction’, ‘Location’, ‘Manner’, etc. Further, the information on obligatoriness (‘?’ in front of a role label indicates its optionality in this text) and on possible morphological forms (here in subscript) is specified for each valency complementation. Each lexical unit can be further described by additional syntactic and syntactic-semantic information, e.g., on reciprocity, diatheses (e.g. passivization), syntactico-semantic class etc.

For the lexicographic representation of CPs, the VALLEX lexicon was extended to cover also predicative nouns. In addition, the respective lexical units representing light verbs and predicative nouns were enriched with attributes that allow a user to derive valency structures of the whole CPs; these attributes are thoroughly described in Section 4.

'''Grammar component.''' The grammar component represents a part of the overall grammar of Czech; it stores formal rules directly related to the valency structure of verbs. This component serves for an economical description of systematic changes in the valency structure of verbs associated with various syntactic phenomena, esp. with passivization and reciprocity. It also comprises rules allowing for the derivation of deep and surface syntactic structures of CPs. These rules are presented in Section 3.

===3 Grammar Component: Formation of Deep and Surface Syntactic Structures of CPs===

====3.1 Deep Syntactic Structure====

The deep syntactic structure of CPs is formed by both valency complementations from the valency frame of the light verb and complementations from the frame of the predicative noun.

'''Predicative nouns.''' The valency frame of a predicative noun describes the usage of the noun in nominal structures. Individual valency complementations are semantically saturated: they correspond to individual semantic participants characterizing a situation denoted by the noun, as can be exemplified on the predicative noun dohoda<sub>PN</sub> ‘agreement’, see its valency frame and an example illustrating its nominal structure in (3) and the correspondence between its valency complementations and its semantic participants in (4):

: (3) dohoda<sub>PN</sub> ‘agreement’: ACT<sub>2,pos</sub> ADDR<sub>s+7</sub> PAT<sub>na+6,o+6,inf,aby,zda,že,cont</sub>
:: dohoda Francie<sub>Party 1,ACT(2)</sub> s Německem<sub>Party 2,ADDR(s+7)</sub> o neútočení<sub>Obligation,PAT(o+6)</sub>
:: ‘the agreement of France with Germany on non-aggression’
: (4) ACT ⇔ Party 1
:: ADDR ⇔ Party 2
:: PAT ⇔ Obligation

'''Light verbs.''' The deep structure of a light verb is formed by its valency frame, with one position (labeled CPHR) reserved for a predicative noun. A single light verb may be characterized by different deep syntactic structures, i.e., described by different valency frames which combine with different predicative nouns, see e.g. the light verb uzavřít<sub>LV</sub> in (5) and (7).

Light verbs, being (to some extent) semantically bleached, do not evoke any semantic participants. As a result, their valency complementations are characterized primarily as semantically underspecified deep syntactic positions, see the schemes provided in (6) and (8) (compare also with [5]).

: (5) uzavřít<sub>LV</sub> ‘make’: ACT<sub>1</sub> ADDR<sub>s+7</sub> CPHR<sub>4</sub>
: (6) ACT ⇔ ∅
:: ADDR ⇔ ∅
:: CPHR ⇔ PN
:: (This LV combines, e.g., with the PNs dohoda ‘agreement’ and sázka ‘bet’.)
: (7) uzavřít<sub>LV</sub> ‘end, conclude’: ACT<sub>1</sub> CPHR<sub>4</sub>
: (8) ACT ⇔ ∅
:: CPHR ⇔ PN
:: (This LV combines, e.g., with the PNs debata ‘discussion’ and vyšetřování ‘inquiry’.)

The only case in which a light verb contributes its own semantic participant is represented by CPs with causative LVs. The causative LVs are seen as initiating the event denoted by the predicative noun selecting the given verb. These verbs thus contribute the ‘Instigator’ participant (and the nouns their respective semantic participants). For example, the LV uzavřít<sub>LV</sub> ‘close’ that is instantiated, e.g., in the CP uzavřít přístup ‘close an access’ represents a causative LV, with the ‘Instigator’ mapped onto its ACT, see the valency frame of this verb (9) and the scheme of the mapping of semantic participants and valency complementations (10):

: (9) uzavřít<sub>LV</sub> ‘close’: ACT<sub>1</sub> CPHR<sub>4</sub> ?BEN<sub>3</sub>
: (10) ACT ⇔ Instigator
:: CPHR ⇔ PN
:: BEN ⇔ ∅
:: (This LV combines, e.g., with the PN přístup ‘access’.)

Within CPs, semantically underspecified valency complementations of LVs acquire semantic capacity via coreference with valency complementations of the predicative nouns with which they form CPs. These coreferential relations between valency complementations of LVs and complementations of PNs thus characterize the deep syntactic structure of individual CPs.

'''Complex predicates with light verbs.''' The deep syntactic structure of a CP is formed via an interplay between the valency frames of the respective LV and PN that form the given CP. A crucial role in the formation of the deep syntactic structure of a CP is played by (i) the number of semantic participants involved in a situation denoted by the CP, and (ii) coreferential relations between the valency complementations of the LV and the PN [8]. The deep syntactic structure of a CP thus consists of:

* all nominal valency complementations, as they (directly) correspond to semantic participants;
* all verbal valency complementations, as their semantic saturation is acquired in one of the following ways:
** the CPHR valency position, as it is reserved for the predicative noun;
** the verbal valency complementation corresponding to the ‘Instigator’ participant (if present);
** other verbal valency complementations, as they corefer with individual nominal valency complementations.

Let us exemplify the deep structure formation on the CP uzavřít dohodu ‘make an agreement’. The predicative noun dohoda<sub>PN</sub> ‘agreement’ is characterized by three semantic participants corresponding to three valency complementations of this noun, as indicated in (3) and (4). The light verb uzavřít<sub>LV</sub> ‘make’ is characterized by the valency frame provided in (5). The CPHR position of the light verb is filled with the PN dohoda ‘agreement’; the remaining valency complementations ACT and ADDR of the light verb enter into coreference with the ACT and ADDR of the given predicative noun, respectively (thus they obtain their semantic capacity from the given nominal complementations), see scheme (11), the sentence below and the deep dependency tree of the given CP in Figure 1. (In the schemes, correspondence between semantic participants and valency complementations is marked with ⇔ whereas ↔ is reserved for coreference relations.)

: (11) uzavřít dohodu ‘make an agreement’:
:: ACT<sub>LV</sub> ↔ ACT<sub>PN</sub> ⇔ Party 1
:: ADDR<sub>LV</sub> ↔ ADDR<sub>PN</sub> ⇔ Party 2
:: PAT<sub>PN</sub> ⇔ Obligation
:: CPHR<sub>LV</sub> ⇔ dohoda<sub>PN</sub>
:: Francie<sub>Party 1</sub> uzavřela s Německem<sub>Party 2</sub> dohodu<sub>PN</sub> o neútočení<sub>Obligation</sub>.
:: ‘France made an agreement with Germany on non-aggression.’

Figure 1: The deep dependency structure of the CP uzavřít dohodu ‘make an agreement’. (Tree: the root uzavřít has the dependents ACT, ADDR and dohoda.CPHR; dohoda.CPHR has the dependents ACT, ADDR and PAT.)

In many cases, a predicative noun can select different light verbs (and thus create different CPs), and so makes it possible to embed the expressed event “into different general semantic scenarios and thus to perspectivize it from the point of view of different participants” [8]. For example, the predicative noun rozkaz<sub>PN</sub> ‘order’ selects either the light verb dát<sub>LV</sub> ‘to give’, or the light verb dostat<sub>LV</sub> ‘to get’. This noun evokes three semantic participants, namely ‘Speaker’, ‘Recipient’, and ‘Information’. When it selects the light verb dát<sub>LV</sub> ‘to give’, the situation expressed by this noun is viewed from the perspective of the ‘Speaker’, as it occupies the prominent subject position given by the ACT of the light verb, see example (12); when it selects the light verb dostat<sub>LV</sub> ‘to get’, the situation is perspectivized from the ‘Recipient’, see example (13).

: (12) Generál<sub>Speaker,ACT-LV</sub> dal rozkaz vojákům<sub>Recipient,ADDR-LV</sub> k ústupu<sub>Information,PAT-PN</sub>.
:: ‘The general gave soldiers the order to retreat.’
: (13) Vojáci<sub>Recipient,ACT-LV</sub> dostali od generála<sub>Speaker,ORIG-LV</sub> rozkaz k ústupu<sub>Information,PAT-PN</sub>.
:: ‘Soldiers got the order to retreat from the general.’

====3.2 Surface Syntactic Structure====

The theoretical analysis supported by the extensive empirical data annotation has revealed that with CPs in Czech, each semantic participant is typically expressed in the surface sentence just once. (The only exception is represented by the semantic participant mapped onto the nominal ACT: under certain conditions, this participant can be expressed twice, both as a verbal and as a nominal modification, e.g., Vrchní komisař<sub>Agens,ACT(1)-LV</sub> již své<sub>Agens,ACT(pos)-PN</sub> vyšetřování<sub>PN</sub> zločinu<sub>Incident,PAT(2)-PN</sub> uzavřel<sub>LV</sub>. ‘The chief inspector has already concluded his investigation of the crime.’) Despite the fact that semantic participants are contributed to CPs by predicative nouns (with the exception of the verbal ‘Instigator’), Czech CPs have a strong tendency to express these participants in the surface structure as verbal modifications, see also [7]. Namely, those participants characterizing a CP that are referred to both by valency complementations of the PN and (via coreference) by complementations of the LV are primarily expressed on the surface as verbal modifications. On the other hand, those participants that are mapped only onto valency complementations of the PN are realized as nominal modifications.

As a result, the rules governing the formation of the surface syntactic structure of Czech CPs can be summarized as follows:

* As verbal modifications, all valency complementations from the valency frame of the light verb are primarily expressed in the surface structure (disregarding the cases of valency complementations unexpressed on the surface due to their optionality, actual ellipsis, generalization etc.), namely:
** (i) the valency complementation filled by the predicative noun (the CPHR functor): it is obligatorily expressed in the surface structure as a verbal modification;
** (ii) the valency complementation corresponding to ‘Instigator’ (if present): it can be expressed in the surface structure only as a verbal modification;
** (iii) other verbal valency complementations: they are primarily expressed in the surface structure as verbal modifications, too.
* As nominal modifications:
** (iv) those valency complementations from the valency frame of the predicative noun that are not in coreference with verbal ones are primarily expressed in the surface structure.

In some cases, a nominal valency complementation coreferring with a verbal one may be alternatively expressed in the surface structure as a nominal modification, see e.g. (a) S Německem<sub>Party 2,ACT-LV</sub> Francie<sub>Party 1,ACT-LV</sub> uzavřela dohodu<sub>PN,CPHR-LV</sub>. vs. (b) Francie<sub>Party 1,ACT-LV</sub> uzavřela dohodu<sub>PN,CPHR-LV</sub> s Německem<sub>Party 2,ACT-PN</sub>., with the ‘Party 2’ participant (s Německem) preferably analyzed as a verbal (in (a)) or a nominal (in (b)) modification.

For instance, within the CP uzavřít dohodu ‘make an agreement’, the following valency complementations are expressed in the surface structure. All the valency complementations of the LV uzavřít ‘to make’ (see its valency frame in (5)) are expressed as verbal modifications on the surface, namely: CPHR reserved for the predicative noun dohoda ‘agreement’ (principle (i)) in the direct object position, and the verbal ACT and ADDR in the subject position and the indirect object position, respectively (principle (iii)); these valency complementations refer to ‘Party 1’ and ‘Party 2’ via coreference with the ACT and ADDR of the PN, see scheme (11). From the valency complementations of the PN dohoda ‘agreement’ (the valency frame in (3)), only PAT (referring to ‘Obligation’, not being in coreference with any verbal complementation) is expressed on the surface as a nominal modification (principle (iv)); the remaining ACT and ADDR complementations of this noun (being in coreference with the verbal ACT and ADDR) are subject to systemic ellipsis; see the example sentence below and its surface dependency tree in Figure 2:

: Francie<sub>Party 1,ACT-LV</sub> uzavřela s Německem<sub>Party 2,ADDR-LV</sub> dohodu<sub>PN,CPHR-LV</sub> o neútočení<sub>Obligation,PAT-PN</sub>.
:: ‘France made an agreement with Germany on non-aggression.’

Figure 2: The surface dependency structure of the sentence Francie uzavřela s Německem dohodu o neútočení. ‘France made an agreement with Germany on non-aggression.’ (simplified). (Tree: the root uzavřela.Pred has the dependents Francie.Sb, s.AuxP and dohodu.Obj; s.AuxP governs Německem.Obj; dohodu.Obj governs o.AuxP, which governs neútočení.Atr.)

===4 Data Component: Interlinking Lexical Units===

As was shown above, the deep and surface syntactic structures of CPs are formed as a combination of valency structures of respective predicative nouns and light verbs, with respect to the coreference between their individual valency complementations. The process of both the deep and surface structure formation is regular enough to be described by rules. These rules operate on the information provided by the data component of the lexicon.

In the data component of the VALLEX lexicon, individual lexical units of verbs and predicative nouns are described. In addition to the core valency information in the form of valency frames, these lexical units carry three special attributes linking the respective pairs of lexical units of the PN and LV and allowing for the derivation of both deep and surface syntactic structures of the whole complex predicate, namely the attributes lvc, map and instig.

'''Attribute lvc.''' Respective lexical units of LVs and PNs that form CPs are linked by the attribute lvc, the value of which is a list of references to respective lexical units. This attribute is attached to lexical units of predicative nouns and (for the user’s convenience) to lexical units of light verbs as well. Figure 3 illustrates three lexical units for the LV uzavřít<sub>LV</sub> ‘make; end, conclude; close, terminate’ (see also (5), (7) and (9)). If an LV forms syntactic structures with different PNs characterized by different coreferential relations, more instances of the attribute lvc (indexed with numbers) are assigned to the relevant lexical unit.

'''Attribute map.''' The information on the coreference between valency complementations of LVs and complementations of PNs is provided in the attribute map. This attribute is attached to PNs (as it is the PN that selects an appropriate LV). The value of the map attribute is a list of pairs of coreferring complementations. Figure 4 illustrates three lexical units for three PNs, namely, dohoda<sub>PN</sub> ‘agreement’ (see also (3)), vyšetřování<sub>PN</sub> ‘investigation’, and přístup<sub>PN</sub> ‘access’. Each PN can be assigned more than one attribute map reflecting different coreference relations; in such cases, the map attributes are co-indexed with the relevant lvc attributes to allow for the correct formation of the structures of the CPs.

'''Attribute instig.''' The information on the mapping of the ‘Instigator’ onto a valency complementation of relevant LVs is recorded in the attribute instig attached to the verbal valency frame, see lexical unit 3 in Figure 3. If an LV forms syntactic structures with different PNs characterized by different coreferential relations, the instig attribute is co-indexed with the respective lvc attribute, containing the list of references to PNs that select the LV with the ‘Instigator’.

Lexeme: uzavírat<sub>impf</sub>, uzavřít<sub>pf</sub>

* 1 LV (impf: sjednávat; pf: sjednat) ‘make’
** -frame: ACT<sub>1</sub> ADDR<sub>s+7</sub> CPHR<sub>4</sub>
** -example: Firmy uzavíraly s Lucemburskem tajné dohody.
** -lvc: dohoda-1, kompromis-1, kontrakt-1, obchod-1, pakt-1, sázka-1, smlouva-1 ‘agreement, compromise, contract, trade, pact, bet, contract’
* 2 LV (impf: ukončovat; pf: ukončit) ‘end; conclude’
** -frame: ACT<sub>1</sub> CPHR<sub>4</sub>
** -example: Policie uzavírá vyšetřování všech tří případů.
** -lvc: debata-1, vyšetřování-1 ‘discussion, investigation’
* 3 LV (impf: zamezovat; pf: zamezit) ‘close; end, terminate’
** -frame: ACT<sub>1</sub> CPHR<sub>4</sub> ?BEN<sub>3</sub>
** -example: Dohoda ale uzavírá přístup na hranici.
** -lvc: přístup-1 ‘access’
** -instig: ACT
* …

Figure 3: Three lexical units for the LV uzavřít, which are instantiated, e.g., in the CPs uzavřít dohodu ‘make an agreement’, uzavřít vyšetřování ‘close an investigation’, and uzavřít přístup ‘close an access’, respectively (simplified).

* dohoda: 1 ujednání; domluva ‘agreement’
** -frame: ACT<sub>2,pos</sub> ADDR<sub>s+7</sub> PAT<sub>na+6,o+6,inf,aby,zda,že,cont</sub>
** -example: dohoda Francie s Německem o neútočení
** -lvc: uzavírat/uzavřít-1, vypovídat/vypovědět-5
** -map: ACT<sub>PN</sub>-ACT<sub>LV</sub> & ADDR<sub>PN</sub>-ADDR<sub>LV</sub>
** …
* vyšetřování: 1 objasňování; prozkoumávání ‘investigation’
** -frame: ACT<sub>2,pos</sub> PAT<sub>2,pos</sub>
** -example: vyšetřování všech odhalených případů zpronevěry
** -lvc: uzavírat/uzavřít-2, vést-5
** -map: ACT<sub>PN</sub>-ACT<sub>LV</sub>
** …
* přístup: 1 možnost někam vstoupit; přistoupení ‘access’
** -frame: ACT<sub>1</sub> DIR3<sub>do+2,k+3,na+4</sub>
** -example: přístup na hranici; přístup na trh práce
** -lvc: otvírat/otevírat/otevřít-1, uzavírat/uzavřít-3
** -map: ACT<sub>PN</sub>-BEN<sub>LV</sub>
** …

Figure 4: Three lexical units for the PNs dohoda ‘agreement’, vyšetřování ‘investigation’, and přístup ‘access’, respectively (simplified).

===5 Corpus Data Analysis===

Tables 1 and 2 summarize the corpus analysis of Czech CPs formed by 129 verb lemmas from the VALLEX lexicon (those LVs were selected that have at least one valency frame with the CPHR functor in the PDT corpus, see Section 2). The CPs were extracted from the Czech National Corpus, SYN2010, by the Word Sketch Engine [17], which makes it possible to identify for each verb lemma its nominal collocates expressed as its direct object (function has_obj4). From the obtained list of collocations, human annotators selected only those nominal collocates that represent PNs (560 noun lemmas in total). As a key criterion for identifying CPs, the coreference between the ACT of the noun and some of the valency complementations of the LV has been adopted [18]. This criterion was satisfied by 1,025 collocations, which represent the most frequent and semantically salient CPs of the selected light verbs.

The identified CPs were further annotated with respect to the coreference between valency complementations of the LV and PN and with respect to the mapping of ‘Instigator’ (where it was relevant), see esp. [10]. Tables 1 and 2 summarize the results of the annotation process. Table 1 contains those CPs the light verbs of which behave unambiguously with respect to the causative feature, i.e., they are either non-causative (∅ in the ‘Instig’ column), or causative. With the CPs with causative light verbs, the ‘Instigator’ was mapped either onto the verbal ACT, or onto the verbal ORIG. In the annotation, 12 types of coreferential relation between verbal and nominal valency complementations were identified; the most frequent was represented by the coreference between the ACT of the light verb and the ACT of the predicative noun (506 CPs, i.e. almost 50% of all analyzed CPs).

In the annotation, a specific type of CPs characterized by an ambiguous character with respect to the causativity of LVs was found (122 cases, i.e. almost 12% of CPs). These CPs are formed by PNs characterized by the semantic participants ‘Experiencer’ and ‘Stimulus’. Two situations occur with these CPs. First, the valency complementation of a PN corresponding to ‘Stimulus’ enters into coreference with the ACT of the LV with which the given PN forms the CP (as exemplified in (14), Figure 5); in this case, the LV behaves as a non-causative verb. Second, the given complementation of a PN is not in coreference with any verbal complementation; in this case, the LV contributes the ‘Instigator’ to the CP (example (15), Figure 6). For example, with the CP vyvolat protest, the semantic participant ‘Stimulus’ given by the PN protest<sub>PN</sub> ‘protest’ and mapped onto the PAT of the noun either enters into coreference with the ACT of the LV vyvolat<sub>LV</sub> ‘to raise’, see example (14), or remains without coreference, see example (15).

: (14) Stavba<sub>Stimulus,ACT-LV</sub> dálnice vyvolala u obyvatel<sub>Experiencer,LOC-LV</sub> protesty<sub>PN,CPHR-LV</sub>.
:: ‘The construction of the motorway has prompted protests of the inhabitants.’
: (15) Stavba<sub>Instigator,ACT-LV</sub> dálnice vyvolala u obyvatel<sub>Experiencer,LOC-LV</sub> protesty<sub>PN,CPHR-LV</sub> proti postupu<sub>Stimulus,PAT-PN</sub> radních.
:: ‘The construction of the motorway has prompted protests of the inhabitants against the decision of councillors.’

Figure 5: The deep dependency structure of the non-causative example (14) (simplified). (Tree: the root vyvolat.PRED has the dependents stavba.ACT, obyvatel.LOC and protest.CPHR; stavba.ACT governs dálnice; protest.CPHR has the dependents ACT and PAT.)

Figure 6: The deep dependency structure of the causative example (15) (simplified). (Tree: the root vyvolat.PRED has the dependents stavba.ACT, obyvatel.LOC and protest.CPHR; stavba.ACT governs dálnice; protest.CPHR has the dependents ACT and postup.PAT; postup.PAT governs radní.)

{| class="wikitable"
|+ Table 1: Unambiguous Czech CPs identified in the corpus data, sorted according to causativity of LVs and types of coreference between verbal and nominal valency complementations.
! ‘Instig’ !! coreference !! # !! % !! examples
|-
| rowspan="8" | ∅ || ACT<sub>PN</sub> – ACT<sub>LV</sub> || 506 || 49.4 || mít chuť, vést život, uzavřít debatu, uzavřít vyšetřování
|-
| ACT<sub>PN</sub> – ACT<sub>LV</sub> & ADDR<sub>PN</sub> – ADDR<sub>LV</sub> || 120 || 11.7 || dát rozkaz, poskytnout rozhovor, uzavřít dohodu, uzavřít sázku
|-
| ACT<sub>PN</sub> – ACT<sub>LV</sub> & PAT<sub>PN</sub> – ADDR<sub>LV</sub> || 93 || 9.1 || navázat vztah
|-
| ACT<sub>PN</sub> – ORIG<sub>LV</sub> & ADDR<sub>PN</sub> – ACT<sub>LV</sub> || 28 || 2.7 || dostat nabídku, získat informace
|-
| ACT<sub>PN</sub> – ORIG<sub>LV</sub> & PAT<sub>PN</sub> – ACT<sub>LV</sub> || 22 || 2.1 || dostat ránu, dostat pokutu
|-
| ACT<sub>PN</sub> – ACT<sub>LV</sub> & PAT<sub>PN</sub> – DIR3<sub>LV</sub> || 28 || 2.7 || obracet pozornost, položit důraz
|-
| ACT<sub>PN</sub> – ACT<sub>LV</sub> & PAT<sub>PN</sub> – LOC<sub>LV</sub> || 22 || 2.1 || najít inspiraci, najít potěšení
|-
| ACT<sub>PN</sub> – LOC<sub>LV</sub> & PAT<sub>PN</sub> – ACT<sub>LV</sub> || 22 || 2.1 || najít odezvu, nalézt pochopení
|-
| rowspan="3" | ACT<sub>LV</sub> || ACT<sub>PN</sub> – ADDR<sub>LV</sub> || 53 || 5.2 || dát naději, vynést slávu, vzít odvahu
|-
| ACT<sub>PN</sub> – LOC<sub>LV</sub> || 26 || 2.5 || probouzet podezíravost, vzbudit zdání
|-
| ACT<sub>PN</sub> – BEN<sub>LV</sub> || 8 || 0.8 || zvednout náladu, otevřít přístup, uzavřít přístup
|-
| ORIG<sub>LV</sub> || ACT<sub>PN</sub> – ACT<sub>LV</sub> || 18 || 1.8 || dostat příležitost, získat výhodu
|}

{| class="wikitable"
|+ Table 2: Ambiguous Czech CPs identified in the corpus data, sorted according to causativity of LVs and types of coreference between verbal and nominal valency complementations.
! coreference without ‘Instigator’ !! ‘Instig’ !! coreference with ‘Instigator’ !! # !! % !! examples
|-
| ACT<sub>PN</sub> – LOC<sub>LV</sub> & PAT<sub>PN</sub> – ACT<sub>LV</sub> || ACT<sub>LV</sub> || ACT<sub>PN</sub> – LOC<sub>LV</sub> || 92 || 9.0 || vyvolat protest, budit důvěru
|-
| ACT<sub>PN</sub> – ADDR<sub>LV</sub> & PAT<sub>PN</sub> – ACT<sub>LV</sub> || ACT<sub>LV</sub> || ACT<sub>PN</sub> – ADDR<sub>LV</sub> || 23 || 2.2 || přinést radost, činit obtíž
|-
| ACT<sub>PN</sub> – LOC<sub>LV</sub> & ORIG<sub>PN</sub> – ACT<sub>LV</sub> || ACT<sub>LV</sub> || ACT<sub>PN</sub> – LOC<sub>LV</sub> || 3 || 0.3 || vzbudit pocit
|-
| ACT<sub>PN</sub> – LOC<sub>LV</sub> & ADDR<sub>PN</sub> – ACT<sub>LV</sub> || ACT<sub>LV</sub> || ACT<sub>PN</sub> – LOC<sub>LV</sub> || 4 || 0.4 || vyvolat podezření
|}

===6 Conclusion===

In this paper, we have summarized the results of our analysis of Czech complex predicates with light verbs. We have described their lexicographic model based on a close cooperation of the lexical and the grammar component. Although our proposal is primarily designed for the Valency Lexicon of Czech verbs VALLEX, we suppose that its main tenets can be easily adopted by other lexical resources as well. Finally, we have introduced the annotation of a large collection of linguistic data which will be integrated in the VALLEX lexicon soon.

===Acknowledgements===

The work on this project has been supported by the grant of the Czech Science Foundation (project GA15-09979S) and partially by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071). This work has been using language resources distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).

===References===

* [1] Algeo, J.: Having a look at the expanded predicate. In Aarts, B., Meyer, C.F., eds.: The Verb in Contemporary English: Theory and Description. Cambridge University Press, Cambridge (1995) 203–217
* [2] Grimshaw, J., Mester, A.: Light verbs and θ-marking. Linguistic Inquiry 19 (1988) 205–232
* [3] Butt, M.: The light verb jungle: Still hacking away. In Amberber, M., Baker, B., Harvey, M., eds.: Complex Predicates in Cross-Linguistic Perspective. Cambridge University Press, Cambridge (2010) 48–78
* [4] Hinrichs, E., Kathol, A., Nakazawa, T.: Complex Predicates in Nonderivational Syntax. Syntax and Semantics 30. Academic Press, San Diego (1998)
* [5] Alonso Ramos, M.: Towards the synthesis of support verb constructions: Distribution of syntactic actants between the verb and the noun. In Wanner, L., Mel’čuk, I.A., eds.: Selected Lexical and Grammatical Issues in the Meaning-Text Theory. John Benjamins Publishing Company, Amsterdam, Philadelphia (2007) 97–137
* [6] Radimský, J.: Verbo-nominální predikát s kategoriálním slovesem. Editio Universitatis Bohemiae Meridionalis, České Budějovice (2010)
* [7] Macháčková, E.: Constructions with verbs and abstract nouns in Czech (analytical predicates). In Čmejrková, S., Štícha, F., eds.: The Syntax of Sentence and Text: A Festschrift for František Daneš. Volume 42 of Linguistic and Literary Studies in Eastern Europe. John Benjamins Publishing Company, Amsterdam, Philadelphia (1994) 365–374
* [8] Kettnerová, V., Lopatková, M.: At the lexicon-grammar interface: The case of complex predicates in the functional generative description. In Hajičová, E., Nivre, J., eds.: Proceedings of Depling 2015, Uppsala, Sweden, Uppsala University (2015) 191–200
* [9] Kettnerová, V.: Syntaktická struktura komplexních predikátů. Slovo a slovesnost (2017) (in print).
* [10] Kettnerová, V., Lopatková, M.: Ke koreferenci u komplexních predikátů s kategoriálním slovesem. Korpus – gramatika – axiologie (2017) (submitted).
* [11] Sgall, P., Hajičová, E., Panevová, J.: The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel, Dordrecht (1986)
* [12] Panevová, J.: Valency Frames and the Meaning of the Sentence. In Luelsdorff, P.A., ed.: The Prague School of Structural and Functional Linguistics. John Benjamins Publishing Company, Amsterdam/Philadelphia (1994) 223–243
* [13] Urešová, Z.: Valence sloves v Pražském závislostním korpusu. Ústav formální a aplikované lingvistiky, Praha (2011)
* [14] Lopatková, M., Kettnerová, V., Bejček, E., Vernerová, A., Žabokrtský, Z.: Valenční slovník českých sloves VALLEX. Karolinum, Praha (2016)
* [15] Kettnerová, V., Lopatková, M., Bejček, E.: The Syntax-Semantics Interface of Czech Verbs in the Valency Lexicon. In: Proceedings of the XV EURALEX International Congress, Oslo, University of Oslo (2012) 434–443
* [16] Lopatková, M., Kettnerová, V.: Alternations: From Lexicon to Grammar And Back Again. In Hajičová, E., Boguslavsky, I., eds.: Proceedings of the Workshop on Grammar and Lexicon: Interactions and Interfaces (GramLex), Ōsaka, Japan, ICCL, The COLING 2016 Organizing Committee (2016) 18–27
* [17] Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., Suchomel, V.: The Sketch Engine: ten years on. Lexicography ASIALEX 1 (2014) 7–36
* [18] Kettnerová, V., Lopatková, M., Bejček, E., Vernerová, A., Podobová, M.: Corpus based identification of Czech light verbs. In Gajdošová, K., Žáková, A., eds.: Proceedings of the Seventh International Conference Slovko 2013; Natural Language Processing, Corpus Linguistics, E-learning, Lüdenscheid, Germany, Slovak National Corpus, Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, RAM-Verlag (2013) 118–128
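
As an illustration of how the rules of Section 3 could operate on the lvc, map and instig attributes of Section 4, the following Python sketch derives the deep and surface structure of a CP from two lexical units. It is a minimal sketch under stated assumptions and not the VALLEX implementation or data format: the class LexicalUnit, its field names (coref stands in for the map attribute) and the functions deep_structure and surface_structure are hypothetical, and frames are reduced to bare functor labels, with optionality, morphological forms and co-indexing of multiple lvc/map instances left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LexicalUnit:
    """Hypothetical, simplified stand-in for a lexical unit of an LV or a PN."""
    ident: str                                   # e.g. "uzavřít-1" or "dohoda-1"
    frame: List[str]                             # functors only, e.g. ["ACT", "ADDR", "CPHR"]
    lvc: List[str] = field(default_factory=list)           # linked lexical units
    coref: Dict[str, str] = field(default_factory=dict)    # 'map': PN functor -> LV functor
    instig: Optional[str] = None                 # LV functor carrying 'Instigator', if any


def deep_structure(lv: LexicalUnit, pn: LexicalUnit) -> Tuple[Dict[str, str], List[str]]:
    """Return (saturation of each verbal functor, list of nominal functors)."""
    if lv.ident not in pn.lvc:
        raise ValueError(f"{pn.ident} does not select {lv.ident}")
    lv_to_pn = {lv_f: pn_f for pn_f, lv_f in pn.coref.items()}
    verbal: Dict[str, str] = {}
    for functor in lv.frame:
        if functor == "CPHR":
            verbal[functor] = f"filled by {pn.ident}"
        elif functor == lv.instig:
            verbal[functor] = "Instigator"
        elif functor in lv_to_pn:
            verbal[functor] = f"corefers with {lv_to_pn[functor]} of PN"
        else:
            verbal[functor] = "unsaturated"       # not covered by the sketch
    # All nominal complementations belong to the deep structure.
    return verbal, list(pn.frame)


def surface_structure(lv: LexicalUnit, pn: LexicalUnit) -> Tuple[List[str], List[str]]:
    """Principles (i)-(iii): every LV functor is a verbal modification;
    principle (iv): only PN functors without verbal coreference surface nominally."""
    verbal = list(lv.frame)
    nominal = [f for f in pn.frame if f not in pn.coref]
    return verbal, nominal


# Simplified entries modelled on Figures 3 and 4.
uzavrit_1 = LexicalUnit("uzavřít-1", ["ACT", "ADDR", "CPHR"], lvc=["dohoda-1"])
dohoda_1 = LexicalUnit("dohoda-1", ["ACT", "ADDR", "PAT"], lvc=["uzavřít-1"],
                       coref={"ACT": "ACT", "ADDR": "ADDR"})
uzavrit_3 = LexicalUnit("uzavřít-3", ["ACT", "CPHR", "BEN"], lvc=["přístup-1"],
                        instig="ACT")
pristup_1 = LexicalUnit("přístup-1", ["ACT", "DIR3"], lvc=["uzavřít-3"],
                        coref={"ACT": "BEN"})

if __name__ == "__main__":
    print(surface_structure(uzavrit_1, dohoda_1))
    print(deep_structure(uzavrit_3, pristup_1))
```

For uzavřít dohodu the sketch keeps only the nominal PAT on the surface, which corresponds to the systemic ellipsis of the nominal ACT and ADDR described in Section 3.2; the alternative nominal expression of a coreferring complementation and the double expression of the nominal ACT are deliberately not modelled.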