Asymmetries in Extraction From Nominal Copular Sentences: a Challenging Case Study for NLP Tools

Paolo Lorusso, Matteo Greco, Cristiano Chesi, Andrea Moro
NEtS at Scuola Universitaria Superiore IUSS. P.zza Vittoria 15, I-27100 Pavia (Italy)
{paolo.lorusso, matteo.greco, andrea.moro, cristiano.chesi}@iusspavia.it

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

In this paper we discuss two types of nominal copular sentences (Canonical and Inverse, Moro 1997) and we demonstrate that the peculiarities of these two configurations are largely overlooked by standard NLP tools that are currently publicly available. We show that example-based MT tools (e.g. Google Translate) as well as other NLP tools (UDpipe, LinguA, Stanford Parser, and Google Cloud AI API) fail to capture the critical distinctions between the two structures. They produce wrong analyses and, possibly as a consequence of a non-coherent (or missing) structural analysis, incorrect translations in the case of MT tools. To support the proposed analysis, we also present an empirical study showing that native speakers are indeed sensitive to the critical distinctions. This poses a sharp challenge for NLP tools that aim at being cognitively plausible or at least descriptively adequate (Chowdhury & Zamparelli 2018).

1. Introduction

The main hypothesis of this paper is that sentence comprehension cannot be achieved independently of a coherent structural analysis. To support this claim, we first present a precise structural analysis that is critical for recovering the relevant dependencies within specific constructions; we then show that the crucial structural properties captured by the theoretical framework are in fact correctly perceived by native speakers, but not revealed by some widely used Natural Language Processing (NLP) tools. This leads to poor performance in tasks like Machine Translation (MT).

This argument seems to us especially relevant in those structural configurations in which a non-local dependency must be established. In parsing, for instance, interpreting a wh- dependency correctly requires that the dependent (the wh- phrase) and the dependee (the head selecting the wh- phrase as its argument/modifier) are identified, and that the nature of the dependence is disambiguated (e.g. argument vs. modifier). In (1) we exemplify the special case of a non-local dependency between a wh- PP and a DP it depends on (a co-indexed underscore signals the possible extraction sites, hence the dependent constituent; the diacritic “*” prefixes, as usual, illegal sites):

(1) [Di quale segnale]i [i telescopi *_i] hanno scoperto *_i [un’interferenza _i]?
    Of which signal the telescopes have discovered an interference?
    ‘[which signal]i did the telescopes discover an interference of _i?’

The second DP un’interferenza (‘an interference’), the internal argument, is the dependee of the wh- phrase; neither the subject DP nor the predicate can host this wh- dependency. According to Google Translate (as of 12th July 2019), however, the latter option seems to be a viable one:

(2) What signal did the telescopes find an interference?

The translation is ill-formed, since the internal argument position of find is filled both by the wh- phrase and by the DP an interference (which cannot take a wh- DP as its own argument due to the absence of a relevant preposition).

In this work we focus on a similar non-local dependency involving two kinds of copular sentences: Canonical (3.a) and Inverse (3.b). Using these constructions, we will test the availability of wh- PP sub-extraction from both the first and the second DP, as exemplified in (4).

(3) a. le foto del muro sono la causa della rivolta
       the pictures of the wall are the cause of the riot
    b. la causa della rivolta sono le foto del muro
       the cause of the riot are the pictures of-the wall
       ‘the cause of the riot is the pictures of the wall’

(4) a. [Di quale rivolta]i le foto del muro sono la causa _i ?
       of which riot the pictures of_the wall are the cause
    b. [Di quale muro]i le foto _i sono la causa della rivolta?
       of which wall the pictures are the cause of_the riot

In the first part of this paper (§2), we briefly present an analysis for these constructions; we then demonstrate that native speakers are selectively sensitive both to the copular structural configuration (Canonical vs. Inverse) and to the extraction site (subject vs. predicate) (§3). In §4 we test the insensitivity of some freely available NLP tools (Google Translate, the Natural Language service of Google Cloud AI API, UDpipe, Stanford Parser and LinguA) to the syntactic oppositions previously discussed.

2. The structure of nominal copular sentences

Copular sentences are those sentences whose main verb is to be (the copula) and its equivalents across languages. A subset of copular sentences involves two DPs, linearly ordered as DP V DP. These are dubbed nominal copular sentences. In this configuration, one nominal phrase realizes the predicate of the sentence (“the cause…” in (3)) while the other is the subject of the predicate (“the pictures…” in (3)). According to Moro (1997), nominal copular sentences can be divided into two subtypes: Canonical copular sentences (3.a), in which the order is subject-copula-predicative expression, and Inverse copular sentences (3.b), in which the order is inverted, i.e. predicative expression-copula-subject.

Moro (1991, 1997, 2006) showed that these two types of copular constructions can be distinguished on the basis of different diagnostics such as agreement on the verb, grammaticality of the extraction of DPs (wh- or clitic) and pronominal binding.

Traditionally, copular sentences are analyzed as involving the raising of a DP from the same base-generated structure (Stowell 1978). Moro (1997, 2018) showed that predicate DPs (including there and its equivalents across languages) can be raised, just like subject DPs, to the preverbal position from the so-called Small Clause (SC), a structure resulting from merging two DPs (Moro 2000, 2009; Chomsky 2013; Rizzi 2016). In other words, while in Canonical copular sentences the subject DP raises to the preverbal position and the predicative DP stays in situ inside the small clause in the postverbal position (5), in Inverse copular sentences the predicative DP raises to the preverbal position and the subject DP stays in situ inside the small clause in the postverbal position (6).

(5) Canonical copular sentence structure
    [IP DPsubj [VP V [SC ti DPpred]]]

(6) Inverse copular sentence structure
    [IP DPpred [VP V [SC DPsubj ti]]]

2.1 Asymmetries in copular sentences

These two different representations offer a principled explanation for many asymmetries across languages. Distinguishing between Canonical and Inverse copular sentences is not always easy or possible (see Jespersen 1924 as cited in Moro 1997). However, agreement and PP/ne sub-extraction offer robust diagnostics. For example, verbs invariably agree with the subject DP in Italian (7), regardless of its pre-verbal or post-verbal position, while they invariably agree with the preverbal DP in English (8):

(7) a. le foto sono/*è la causa
       the pictures are/*is the cause
    b. la causa sono/*è le foto
       the cause are/*is the pictures
    (Italian)

(8) a. the pictures are/*is the cause.
    b. the cause *are/is the pictures
    (English)

Extraction is only allowed from the post-verbal DP (the predicate) in Canonical sentences (9), whereas it is not allowed from the post-verbal DP (the subject) in Inverse copular sentences (10).

(9) a. [which riot]i do you think a picture of the wall was the cause of _i?
    b. [di quale rivolta]i pensi che una foto del muro sia la causa _i?
       [of which riot]i do you think that a picture of_the wall is the cause _i?

(10) a. *[which wall]i do you think a cause of the riot was a picture of _i?
     b. *[di quale muro]i pensi che la causa della rivolta sia una foto _i?
        [of which wall]i you think that the cause of_the riot is a picture _i?

3. Experimental evidence supporting the analysis of copular sentences

Before considering the computational side of the proposed structural analysis, we investigated whether the human parser is sensitive to the critical distinctions illustrated here. Two experiments are discussed, testing the processing of Canonical vs. Inverse copular sentences (first factor) involving the extraction of a wh- element from a DP embedded either under the subject or under the predicate (second factor).

Our prediction was that the sensitivity to agreement and to the argumental vs. predicative role distinction for the two DPs involved would influence both the online and the offline performance of native speakers: participants should show an advantage in parsing Canonical copular sentences (vs. Inverse ones), since only the Canonical configuration allows extraction from the predicate DP, whereas all the other kinds of extraction (from the subject in Canonical sentences, and from both the subject and the predicate in Inverse sentences) should be disallowed (§2.1).

In order to test these hypotheses, we performed (i) a Self-Paced Reading (SPR) experiment with a Sentence Comprehension Task at the end, and (ii) an Acceptability Judgement Task (AJT).

3.1 Material and methods

In both the SPR and the AJT the set of stimuli was the same: 128 items (divided into 4 conditions), with 40 fillers in the SPR and 60 fillers in the AJT per condition (72 items per experiment in SPR, 92 in AJT). The 2x2 design produced four experimental conditions, exemplified in (11):

(11) Condition 1: Canonical + Extraction from the Subject
     *[PP Di quale muro]i … [DP le foto _i]a sono [SC [_a] [DP la causa [PP della rivolta]]]?
     Of which wall the pictures are the cause of_the riot?

     Condition 2: Canonical + Extraction from the Predicate
     [PP Di quale rivolta]k … [DP le foto [PP del muro]]a sono [SC [_a] [la causa _k]]
     Of which riot the pictures of_the wall are the cause?

     Condition 3: Inverse + Extraction from the Subject
     *[PP Di quale muro]i … [la causa [PP della rivolta]]b sono [SC [le foto _i] [_b]]?
     Of which wall the cause of_the riot are (=is) the pictures?

     Condition 4: Inverse + Extraction from the Predicate
     *[PP Di quale rivolta]k … [la causa _k]b sono [SC [DP le foto [PP del muro]] [_b]]?
     Of which riot … the cause are (=is) the pictures of_the wall

3.2 Self-Paced Reading

32 native Italian speakers participated in the experiment. Stimuli consisted of questions and their answers; participants had to read the question word by word and then the answer. Finally, they had to judge the appropriateness of the answer.
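The 2x2 design in (11), together with the extraction prediction stated in §3 and the agreement diagnostic in (7) and (8), can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Condition`, `extraction_predicted_licit`, `agreement_controller` and `build_conditions` are our own assumptions and are not part of the software used to run the experiments.

```python
from dataclasses import dataclass
from itertools import product
from typing import List

STRUCTURES = ("Canonical", "Inverse")        # copular configuration
EXTRACTION_SITES = ("Subject", "Predicate")  # DP the wh- PP is extracted from


@dataclass(frozen=True)
class Condition:
    """One cell of the 2x2 design (hypothetical helper type)."""
    number: int
    structure: str
    site: str
    predicted_licit: bool


def extraction_predicted_licit(structure: str, site: str) -> bool:
    # Prediction from the text: only extraction from the predicate DP of a
    # Canonical sentence is expected to be allowed.
    return structure == "Canonical" and site == "Predicate"


def agreement_controller(language: str, structure: str) -> str:
    """Position of the DP the copula agrees with, following (7) and (8)."""
    if language == "English":
        # English: the copula agrees with the preverbal DP.
        return "preverbal"
    if language == "Italian":
        # Italian: the copula agrees with the subject, which is preverbal
        # in Canonical sentences and postverbal in Inverse ones.
        return "preverbal" if structure == "Canonical" else "postverbal"
    raise ValueError(f"no agreement rule sketched for {language!r}")


def build_conditions() -> List[Condition]:
    # Enumerate the cells in the same order as Conditions 1-4 in (11).
    cells = product(STRUCTURES, EXTRACTION_SITES)
    return [
        Condition(n, structure, site, extraction_predicted_licit(structure, site))
        for n, (structure, site) in enumerate(cells, start=1)
    ]


if __name__ == "__main__":
    for c in build_conditions():
        star = "" if c.predicted_licit else "*"
        print(f"Condition {c.number}: {star}{c.structure} + Extraction from the {c.site}")
```

Encoding the prediction as a function of the two factors makes explicit that the expected pattern is an interaction rather than two independent main effects: a single cell (Condition 2) is predicted to be licit.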
3.3 Results

Participants showed higher accuracy in answering comprehension questions when the extraction occurred from the post-verbal DP in Canonical copular sentences (the predicate DP in Condition 2) than in Inverse copular sentences (the subject DP in Condition 3), while extraction from the Inverse copular constructions induced lower accuracy (-0.41, z = -2.054, p = 0.04; Fig. 1). This confirms that the structural asymmetry between referential subjects and predicative DPs has a central role in both the processing and the comprehension of nominal copular sentences. Similarly, the Inverse vs. Canonical opposition seems relevant, since extractions from both sites in the Inverse copular constructions produce less accurate answers compared to extraction from the predicate in Canonical copulars (coherently with Moro 1997, 2006, who predicts both DPs in Inverse constructions to be illegal extraction sites).

Fig.1 Percentage of correct answers across conditions.

Reading times, on the other hand, revealed a clear difference at the copular region for the two conditions (t=3.37 p=0.002), suggesting a penalty for the Inverse copular constructions compared to the Canonical one. At the first DP region, too, the Predicate vs. Subject distinction is reliably differentiated (t>2 p=0.008), indicating that the la causa (“the cause”) and le foto (“the pictures”) conditions, respectively the predicate and the subject condition, are perceived as different.

3.4 Acceptability Judgement Task

40 native Italian speakers participated in the experiment. Stimuli were the same as in the SPR. Participants had to rate the acceptability of questions on a scale from 1 to 7.

3.5 Results

The results (Fig. 2) confirm the previous on-line findings and show that (i) Canonical constructions were more acceptable than Inverse ones and that (ii) among the different types of copular sentences, the ones with extraction from predicates received higher ratings than the ones with extraction from subjects.

Fig.2 Acceptance rates across conditions.

4. Parsing copular sentences

To evaluate the state of the art of NLP with respect to the contrasts we discussed (Canonical vs. Inverse copular sentences) in a configuration where overt agreement disambiguates the critical roles (predicate vs. subject), we ran a few tests using the following tools:

1. UDpipe (Straka et al 2016)
2. Stanford Parser - English (Chen & Manning 2014)
3. LinguA parser (Attardi, Dell’Orletta 2009)
4. Google Translate (translate.google.com)
5. Google Cloud AI Solutions (cloud.google.com)

We first tested standard Canonical (3.a) and Inverse (3.b) copular constructions, then we tried to assess qualitatively the output analyses provided by these tools with respect to sub-extraction from the predicate in Canonical sentences (9.a-b), here repeated for convenience:

(3) a. le foto del muro sono la causa della rivolta
       the pictures of the wall are the cause of the riot
    b. la causa della rivolta sono le foto del muro
       the cause of the riot are the pictures of-the wall
       ‘the cause of the riot is the pictures of the wall’

(9) a. [which riot]i do you think a picture of the wall was the cause of _i?
    b. [di quale rivolta]i pensi che una foto del muro sia la causa _i?
       [of which riot]i do you think that a picture of the wall is the cause _i?

4.1 UDpipe

The UDPipe Natural Language Processing - Text Annotation interface (Wijffels 2018, Straka et al 2016) provides a handy tool that is easily integrated in the R environment. Various pre-trained models are available for many languages. We ran our analyses using the pre-trained model italian-isdt-ud-2.4-190531. The results of the analysis for Canonical (12.a) and Inverse (12.b) sentences are simply the same. In fact, not even the basic local dependencies are fully recovered (e.g. det-noun). The analysis of sub-extraction from the predicate in Canonical structures (13.a) is paradoxically less disastrous than the other analyses, but if we try to analyze sub-extraction from the subject of a Canonical construction, we obtain wrong analyses (13.b) (the wh- item is considered an extra argument of cause):

(12) a. Canonical copular sentence analysis
     b. Inverse copular sentence analysis

(13) a. sub-extraction from predicate in Canonical configuration
     b. sub-extraction from subject in Canonical configuration

4.2 Stanford Parser

The Stanford parser (Chen & Manning 2014) can be considered the state-of-the-art parser for English. Canonical constructions, in fact, gave it the opportunity to live up to expectations: the analysis of the Canonical copular sentence (14.a) is perfectly in line with the analysis presented in §2-§2.1 (cause is identified as the predicate and pictures as its subject). Unfortunately, the same analysis is proposed for Inverse copular constructions (14.b).

(14) a. Canonical copular sentence analysis
     b. Inverse copular sentence analysis

The quality of the analysis for the sub-extraction case confirms every suspicion: the sub-extracted wh- item (which riot) is wrongly associated with the matrix predicate (think) (15).

(15) sub-extraction from predicate in Canonical configuration

4.3 LinguA

The LinguA annotation pipeline (a service provided on-line by ItaliaNLP Lab at Istituto di Linguistica Computazionale "Antonio Zampolli" ILC in Pisa) has been used for our tests on Italian. It implements a version of the Attardi & Dell’Orletta (2009) parser (currently the state-of-the-art parser for Italian). The analyses of this parser are definitely more precise than the ones proposed by the UDpipe tool, but the symmetric results returned for both Canonical and Inverse copular sentences did not identify either the dependency between the predicate and the subject or their actual role in the structure (16.a-b). The analysis of the extraction, interestingly, attempts an interpretation of the wh- item as an (extra) argument of the first DP (le foto [di quale rivolta] (del muro)). This is a wrong analysis, but it is coherent with the slow-down observed in the self-paced reading experiment (§3.3) at the first DP region, though the parser does not make the relevant distinction between subject (17.a) and predicate (17.b) (in this second case, sub-extraction is interpreted as a copula argument).

(16) a. Canonical copular sentence analysis
     b. Inverse copular sentence analysis

(17) a. sub-extraction from predicate in Canonical configuration
     b. sub-extraction from subject in Inverse configuration

4.4 Google AI

We finally investigated the Natural Language service, one of the tools provided by the Google Cloud AI Solutions API, which returns syntactic representations of sentences (https://cloud.google.com/natural-language/). While both Canonical and Inverse copular analyses in English are equivalent to the ones provided by the Stanford Parser (hence partially consistent with our analyses), in Italian, using the Canonical copular sentence ‘[le intercettazioni]k [sono]k [la documentazione]i’ (‘the interceptions are the documentation’), the tool incorrectly analyses the predicate DP the documentation as an attribute (Fig. 4) (this might be a consistent annotation of all nominal predicates adopted by Google, but it is clearly misleading here). Moreover, when it is provided with the Inverse form of the sentence ‘la documentazione sono le intercettazioni’ (lit. the documentation are the interceptions; ‘The documentation is the interceptions’), the tool incorrectly analyzes the raised predicative DP the documentation, a singular noun, as the subject, putting it in a wrong agreement relation with the verb (plural form) (Fig. 5). In the end, then, this parser fails to recognize the critical difference between Canonical and Inverse copular sentences, giving exactly the same analysis for both cases (3.a) and (3.b).

Fig.4 The structural analysis of the Canonical sentence ‘le intercettazioni sono la documentazione’ (‘The interceptions are the documentation’) given by Google Natural Language.

Fig.5 Structural analysis of the Inverse copular sentence ‘la documentazione sono le intercettazioni’ (lit. the documentation are the interceptions; ‘The documentation is the interceptions’) given by Google Natural Language.

4.5 Google Translate

In order to evaluate the impact of these wrong analyses on a practical NLP task, we carried out our final experiments on one of the most famous and widely used machine translation systems: Google Translate.

Starting with simple examples, we observed that when the tool is provided with the Italian Inverse copular sentence ‘La causa della rivolta sono le foto del muro’ (lit. the cause of the riot are the pictures of the wall; ‘The cause of the riot is the pictures of the wall’), it gives the wrong English translation ‘*The cause of the uprising are the photos of the wall’ (Fig. 6), in which the verb does not agree with the pre-verbal DP “the cause of the uprising”, contrary to what is required in English (as we saw in (8)).

Fig.6 Example from Google Translate: https://translate.google.it/?hl=it#view=home&op=translate&sl=auto&tl=en&text=La%20causa%20della%20rivolta%20sono%20le%20foto%20del%20muro

Interestingly, reversing the translation direction, from English to Italian, for the cause of the riot is the pictures of the wall, the system correctly produces la causa della rivolta sono le immagini del muro, where proper agreement (with the post-verbal subject) is in place. Since the analysis provided by every tool we tested is theoretically inconsistent with this result, we hypothesized that this translation could have been obtained by adopting an example-based approach. It was then worth testing whether the correct agreement with the post-verbal subject is just an accident (this is a well-known prototypical sentence, widely discussed in the literature, and it might have been included in the Google Translate training set) or whether the analysis generalizes to any possible subject/predicate pair.

A sentence like la documentazione sono le intercettazioni (lit. the documentation are the interceptions, meaning ‘The documentation is the interceptions’) suits our purpose nicely. In the Italian > English direction the correct singular copular agreement is produced (“the documentation is the interceptions”), but from English to Italian this time the wrong agreement is obtained, totally ignoring the number of the real post-verbal subject (the documentation is the interceptions > la documentazione è le intercettazioni). We concluded that no deep analysis is attempted to distinguish between subject and predicate roles, and this turns out to be fatal.

5. Conclusion

In this paper we demonstrated that nominal copular sentences constitute a clear challenge for computational analysis, since the same string of elements [DP V DP] can in principle have two different syntactic representations (hence two different meanings), depending on which kind of copular sentence is realized (Canonical or Inverse). We spotted various glitches in the automatic analyses, which in the end led either to significant failures (Google Translate) or to rough structural hypotheses that bluntly ignore the relevant contrasts discussed here. Our empirical study, testing both online and offline the wh- PP sub-extraction possibilities from both subject and predicate DPs, shows that native speakers are sensitive to the different structural roles; in addition, they perceive, as expected, the underlying structural representation of Canonical vs. Inverse copular constructions. None of the NLP tools we tested succeeded in providing a full set of coherent analyses, with the exception of the Stanford Parser for English, which at least succeeded in correctly analyzing the Canonical copular sentences. This analysis was however insufficient in the case of Inverse constructions and in the case of sub-extraction, confirming that non-local dependencies are critical configurations that native speakers are able to parse but machines cannot yet.

References

Attardi G., Dell’Orletta F. (2009). Reverse Revision and Linear Tree Combination for Dependency Parsing. In: NAACL-HLT 2009 – North American Chapter of the Association for Computational Linguistics – Human Language Technologies (Boulder, Colorado, June 2009). Proceedings, Association for Computational Linguistics, 2009. pp. 261–264.

Chen D., C. D. Manning. (2014). A Fast and Accurate Dependency Parser using Neural Networks. Proceedings of EMNLP 2014. pp. 740–750.

Chomsky, N., (2013). ‘Problems of projection.’ Lingua 130: 33–49.

Chowdhury, S. A., & Zamparelli, R. (2018, August). ‘RNN simulations of grammaticality judgments on long-distance dependencies.’ In Proceedings of the 27th International Conference on Computational Linguistics (pp. 133–144).

Jespersen, O., (1924). The Philosophy of Grammar, Allen & Unwin, London.

Moro, A., (1991). The raising of predicates: copula, expletives and existence. MIT Working Papers in Linguistics 15: 119–181.

Moro, A., (1997). The Raising of Predicates. Cambridge: Cambridge UP.

Moro, A., (2000). Dynamic Antisymmetry. Linguistic Inquiry Monograph, Series, MIT Press.

Moro, A., (2006). ‘Copular sentences.’ In Everaert, M. & H. van Riemsdijk (eds.), MA. Blackwell Companion to Syntax II, Blackwell, Oxford, 1–23.

Moro, A., (2009). ‘Rethinking Symmetry: A Note on Labelling and the EPP.’ In La grammatica tra storia e teoria: Scritti in onore di Giorgio Graffi, edited by P. Cotticelli Kurras and A. Tomaselli, 129–31. Alessandria: Edizioni dell’Orso; also at http://www.ledonline.it/snippets/allegati/snippets19007.pdf.

Moro, A., (2018). ‘Copular sentences.’ In Everaert, M. & H. van Riemsdijk (eds.), MA. Blackwell Companion to Syntax, Revised edition vol. II, Blackwell, Oxford, 1–23.

Rizzi, L., (2016). ‘Labeling, maximality, and the head-phrase distinction.’ The Linguistic Review 33, 103–127.

Straka, M., Hajic, J., & Straková, J. (2016). UDPipe: trainable pipeline for processing CoNLL-U files performing tokenization, morphological analysis, pos tagging and parsing. In Proceedings of the tenth international conference on language resources and evaluation (LREC 2016) (pp. 4290–4297).

Stowell, T., (1978). ‘What was there before there was there.’ In D. Farkas et al., eds., Papers from the Fourteenth Regional Meeting, Chicago Linguistic Society. Chicago Linguistic Society, University of Chicago.

Wijffels, J. (2018). udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit. R package version 0.5.