ITAT 2016 Proceedings, CEUR Workshop Proceedings Vol. 1649, pp. 68–73, http://ceur-ws.org/Vol-1649, Series ISSN 1613-0073, © 2016 K. Přikrylová, V. Kuboň, K. Veselovská

Logical vs. Natural Language Conjunctions in Czech: A Comparative Study

Katrin Přikrylová, Vladislav Kuboň, and Kateřina Veselovská

Charles University in Prague, Faculty of Mathematics and Physics, Czech Republic
{prikrylova,vk,veselovska}@ufal.mff.cuni.cz

Abstract: This paper studies the relationship between conjunctions in a natural language (Czech) and their logical counterparts. It shows that the process of transforming a natural language expression into its logical representation is not straightforward. The paper concentrates on the most frequently used logical connectives, ∧ and ∨, and it analyzes the natural language phenomena which influence their transformation into logical conjunction and disjunction. The phenomena discussed in the paper are temporal sequence, expressions describing a mutual relationship, and the consequences of using the plural.

1 Introduction and motivation

The endeavor to express natural language sentences in the form of logical expressions is probably as old as logic itself. A very important role in this process is played by natural language conjunctions and their transformation into logical connectives. Conjunctions are much more ambiguous than logical connectives, and thus it is necessary to analyze their role in natural language sentences, in various contexts and types of texts. This paper presents a step towards such an analysis for one particular language, Czech.

Let us recall that the fundamental task of logic is to set rules and methods for inferencing and referencing. Natural languages, on the other hand, serve primarily for communication. Speakers can reach an agreement or understand each other even without strict adherence to preset rules (regardless of whether these are morphological, grammatical or stylistic). A human brain can obtain substantial information even from ill-formed sentences, which allows such sentences to fulfill their main goal, to serve as a tool for communication. On the other hand, Noam Chomsky introduced in [1] the famous example Colourless green ideas sleep furiously, a grammatically well-formed sentence which does not have any meaning and thus cannot serve the communication task.

Sentences in any natural language are not isolated: their meaning typically depends on the context in which they appear, on the way they are pronounced, or even on external factors such as the gestures which accompany them. Natural languages also evolve in time according to the needs of the language community, and although each natural language has a set of generally applicable rules (syntactic, stylistic, morphological etc.), there are many exceptions and irregularities which do not abide by the rules as strictly as is the case in logic.

Primarily due to this difference, the transformation of natural language sentences into their logical representation constitutes a complex issue. As we are going to show in the subsequent sections, there are no simple rules which would allow automation of the process; the majority of problematic cases require an individual approach.

In the following text we restrict our observations to the two most frequently used conjunctions, namely a (and) and nebo (or).

2 Sentences containing the conjunction a (and)

The initial assumption about complex sentences containing the conjunction a (and) is inspired by the properties of the corresponding logical connective and: we suppose that the two clauses connected by the conjunction express two situations which are valid at the same time. This really holds in a number of complex sentences, for example here:

    (1) Lev je kočkovitá šelma a žije v Africe.
        (Lion is a feline and it lives in Africa.)

    (2) Jana je ve škole a Honza leží nemocný v posteli.
        (Jana is at school and Honza lies ill in bed.)

In sentence 1 we have used the so-called gnomic present¹. The truth value of the whole sentence is TRUE only if both clauses are TRUE, regardless of the context or current situation.

Complex sentences with the gnomic present constitute probably the simplest case. It is not necessary to investigate whether the clauses are true or under which conditions they might become true: they are simply true either always or never. Such sentences can be transformed into a logical representation in a simple and straightforward manner.² In the logical representation of the example above we would of course use the construction A ∧ B for the conjunction a (and).

Sentence 2 describes two situations being TRUE exactly at this moment³. The truth value of both clauses can be determined by a reference from the language to the real world, where we find out whether both clauses describe a valid situation.⁴

Neither of the two clauses of sentence 2 is absolutely true (Jana does not spend every minute of her life at school and Honza is not ill forever). However, when we utter either of these two statements, we do not mean that Jana should stay at school all the time. The use of the present tense implicitly carries the information that she is there just now, at this moment. If we accept that in order to determine the truth value we have to look into the real world and also take into account the time when the sentence was uttered, we could paraphrase the sentence into a less ambiguous variant, for example like this:

    (3) Jana je právě teď ve škole a Honza nyní leží nemocný v posteli.
        (Jana is at school just now and Honza is now lying ill in bed.)

i.e. with the added information about time. Such a sentence would then correspond to the logical scheme of conjunction: A ∧ B.⁵

Natural languages use, of course, also other tenses. What if we would like to express the same content in the past, for example yesterday?

    (4) Jana byla včera ve škole a Honza včera ležel nemocný v posteli.
        (Jana was at school yesterday and Honza was lying ill in bed yesterday.)

At first sight, there seems to be no substantial problem. The only difference seems to be that we are not referring to the current moment, but to a moment in the past (in this case, yesterday). However, what if Honza recovers by the next day: will the sentence have the same truth value tomorrow as well?

Regardless of what time the expressions refer to, we are interested in them only if they are TRUE at the current moment. We should thus rather simplify our sentence in the following way:

    (5) Právě teď platí, že Jana je ve škole, a současně, že Honza leží nemocný v posteli.
        (Just now it is true that Jana is at school and, at the same time, that Honza is lying ill in bed.)

Or, we could drop the initial part, which we may consider to be implicitly present:

    (6) Jana je ve škole, a současně Honza leží nemocný v posteli.
        (Jana is at school and, at the same time, Honza is lying ill in bed.)

The complex sentence introduced above can also be inserted into such a template:

    (7) Jana byla včera ve škole a současně Honza včera ležel nemocný v posteli.
        (Jana was at school yesterday and, at the same time, Honza was lying ill in bed yesterday.)

All the complex sentences mentioned above can be described schematically in the form A ∧ B. The fact that we can express the mutual relationship of the clauses by means of a logical scheme actually means that we can work with them according to logical rules. For example, logical conjunction is commutative, and we really can swap the order of the clauses in our complex sentence and still retain the original truth value.

¹ The present tense can also be used for so-called extratensal processes which are valid always, regardless of the current situation. In our example we describe properties of an animal species and its habitat.
² Let us point out that mathematical theorems typically contain the gnomic present.
³ Of course, only if it is true that Jana is at school just now and Honza lies ill in his bed.
⁴ Determining the truth value of natural language expressions is studied by epistemology; a simple explanation can be found for example in [2].
⁵ If we wanted to consider the tiniest details, we would also have to consider the issue of proper names and singular terms: our sentence does not specify which Jana and Honza we are talking about. More on this topic can be found for example in [3].

2.1 Violation of a temporal sequence

Unfortunately, the conjunction a (and) does not appear only in sentences describing actions which happen at the same moment. All of the following sentences contain a (and) as their main conjunction:

    (8) Honza spadl a zlomil si ruku.
        (Honza fell and broke his arm.)

    (9) Jana odemkla a vešla do bytu.
        (Jana unlocked the door and entered the flat.)

    (10) Šli jsme na výstavu a potom do kina.
         (We went to an exhibition and then to the cinema.)

These sentences are apparently not commutative. The order of the clauses cannot be swapped without affecting the truth value or meaning of the whole sentence. The reason is obvious: the clauses are ordered in a temporal sequence.

The conjunction a (and) is not a logical conjunction in these sentences, although it fulfills one fundamental condition: if the whole sentence is supposed to be true, then both clauses also have to be true.

Propositional logic nevertheless cannot cope with sentences of this kind. We might be tempted to solve this issue by means of the conditional construction Když..., (pak) ... (When... (then) ...):

    (11) Když Jana odemkla, vešla do bytu.
         (When Jana unlocked the door, she entered the flat.)

and thus to find a scheme corresponding to an implication. In natural language, the modified sentence is equivalent to the original one, but this is true only because the construction Když..., (pak) ... (When... (then) ...) does not necessarily always mean an implication. In this particular case, its role is more temporal than conditional.⁶ Transforming the sentence into the scheme A → B is thus incorrect.

In [4], František Gahér suggests a very simple test of whether a particular expression containing the conjunction a (and) is a logical conjunction or not. He uses the expression a současně (and at the same time). The sentence:

    (12) Jana odemkla a současně vešla do bytu.
         (Jana unlocked the door and at the same time entered the flat.)

does not make much sense, and thus we should not directly transform it into a logical conjunction. However, the author himself admits that such a simple test is not 100% reliable. The construction:

    (13) Gödel se narodil v roce 1906 a zemřel v roce 1978.
         (Gödel was born in 1906 and died in 1978.)

actually has all the required properties of a conjunction: both clauses must be true if the whole sentence is to be true, and their order can be changed⁷. However, when we try to replace a (and) by the construction a současně (and at the same time), we do not get a meaningful sentence:

    (14) Gödel se narodil v roce 1906 a současně zemřel v roce 1978.
         (Gödel was born in 1906 and at the same time died in 1978.)

Let us now return to the original sentence. We have already mentioned that it is impossible to describe it in propositional logic without losing important information about the order of events. What if we used some other type of logic? The type which seems to be ideally suited for this kind of construction is temporal logic. It is in fact propositional logic enriched by so-called temporal operators, by means of which we can express a temporal sequence of actions. More information about this kind of logic can be found, e.g., in [5] or [6].

⁶ The conjunction když (when) is then ambiguous.
⁷ Although it is more natural to use them in this order, the variant with the reversed order violates neither the linguistic rules nor the logical meaning of the sentence.

2.2 Disjunction

So far, we have dealt with the conjunction a (and) in the cases in which it expressed conjunction. Let us now show that the same natural language conjunction may in some specific cases also serve as a logical disjunction.

Let us take the sentence:

    (15) Jestliže Honza neodevzdá diplomovou práci včas a nepřihlásí se ke státnicím, studia letos nedokončí.
         (If Honza doesn't submit the thesis in time and doesn't register for the state exams, he won't finish his studies this year.)

If we wanted to preserve the equivalence of a (and) and logical conjunction, we could write this sentence schematically as (¬A ∧ ¬B) → ¬C (with A: Honza submits the thesis in time, B: he registers for the state exams, C: he finishes his studies this year). And indeed, utterances corresponding to this scheme can often be heard from Czech native speakers. If we look at the given sentence more closely, we will agree that in order to finish one's studies it is indeed necessary to submit the thesis in time and at the same time to register for the state exams. If at least one of these two conditions is not fulfilled, Honza will not finish his studies. The scheme (¬A ∧ ¬B) → ¬C, on the other hand, is falsified only when both conditions are unfulfilled and Honza nevertheless finishes his studies; it says nothing about the case in which just one of the conditions fails.

It would therefore be more correct to describe the complex sentence schematically as (¬A ∨ ¬B) → ¬C. The conjunction a (and) clearly substitutes for logical disjunction in this context. Actually, even in natural language it would be more correct to use the conjunction nebo (or) and to say:

    (16) Jestliže Honza neodevzdá diplomovou práci včas nebo nepřihlásí se ke státnicím, studia letos nedokončí.
         (If Honza doesn't submit the thesis in time or doesn't register for the state exams, he won't finish his studies this year.)

The fact that this error is quite frequent in natural language communication is documented for example in the research of Vlastimil Chytrý [7] conducted among pupils of basic and secondary schools. Only 11.5% of them were able to correctly negate the conjunction in the antecedent of the implication when they were asked to paraphrase it. We can only speculate why native speakers make this error so often.⁸

⁸ More about the processes in the speech center of the human brain can be found for example in [8].

2.3 Relation Expression

Let us emphasize that, e.g., the sentence:

    (17) Jana a Honza jsou studenti.
         (Jana and Honza are students.)

is actually a compound sentence:

    (18) Jana je studentka a Honza je student.
         (Jana is a student and Honza is a student.)

i.e. it expresses two utterances.

In sentence 18, the conjunction a (and) is equivalent to a conjunction in logic. However, let us investigate the following examples:

    (19) Jana a Honza jsou přátelé.
         (Jana and Honza are friends.)

    (20) Jana a Honza se milují.
         (Jana and Honza love each other.)

    (21) Barma a Myanmar je totéž.
         (Burma and Myanmar is the same thing.)

To rephrase the first sentence as:

    (22) Jana je přítelkyně a Honza je přítel.
         (Jana is a friend and Honza is a friend.)

makes no sense, since we lose the information about the relationship between Jana and Honza.

In Czech, the second sentence could be rephrased as:

    (23) Jana se miluje a Honza se miluje.
         (Jana loves herself and Honza loves himself.)

but its meaning is not the same as that of the original sentence, which is ambiguous in Czech. It contains the reflexive verb milovat se (to love someone), which expresses either a relationship between two subjects or a relationship of each of them to himself or herself. However, we usually use this verb in situations in which we want to express a relationship between two people. In any case, this example illustrates that the same utterance can be formally represented using two different logical schemes. If we wanted to express the second meaning, we would write it down using the means of predicate logic as

    love_oneself(Jana) ∧ love_oneself(Honza)

To capture the first meaning, we would have to use not a unary relation, but a binary one,

    Love(Jana, Honza),

which would in this case be symmetric. Therefore, we would have to abandon propositional logic in order to describe this type of sentence.

The last sentence from the list cannot be rephrased as:

    (24) Barma je totéž a Myanmar je totéž.
         (Burma is the same thing and Myanmar is the same thing.)

This sentence makes no sense, since the phrase to be the same thing again implies a relationship between the entities. These examples clearly document the fact that the conjunction a (and) used in utterances which express a relationship cannot be used as a conjunction in logic. The following sentences represent other cases in which the conjunction a (and) refers to a relationship between the subjects:

    (25) Černé a bílé ponožky se pomíchaly.
         (Black and white socks got mixed.)

    (26) Jana a Honza se vzali.
         (Jana and Honza got married.)

In the above examples we have shown that if the conjunction a (and) is used in an utterance which expresses a relationship, it cannot be used as a conjunction in logic.

2.4 Problems with plural

The method introduced above, which connects pieces of text smaller than whole compound sentences, can be called a distributive method in the mathematical sense of that term. However, the method is not flawless, and we have already shown examples to which it cannot be applied. We will now demonstrate imperfections of the method that are not related only to lexicon/semantics (i.e. to particular words which do not let us use the method due to their lexical meaning), but rather to syntax. Let us consider the following sentence:

    (27) Pošťák přivezl velký a těžký balík.
         (A postman delivered a big and heavy package.)

It is natural to agree with the premise that the postman delivered only one package, which was big and heavy at the same time.¹¹ However, if we divide the sentence into two propositions:

    (28) Pošťák přivezl velký balík a pošťák přivezl těžký balík.
         (A postman delivered a big package and a postman delivered a heavy package.)

the most natural interpretation would probably be that the postman delivered two packages, one of them big and the other one heavy.

The distinction is even more obvious in the following sentence:

    (29) Na ulici stálo modré a zelené auto.
         (There was a blue and green car parking on the street.)

Although the word car is used here in the singular, we would probably say that there were two cars parking on the street, one of them blue and the other one green. If the author used the plural:

    (30) Na ulici stála modrá a zelená auta.
         (There were blue and green cars parking on the street.)

we would probably come to the conclusion that there were even more than two cars parking on the street.

These examples demonstrate that when two adjectives are connected, the interpretation of the conjunction a (and) is not clear. Whereas the first example (27) describes one object having two characteristics, the second example (29) describes two different objects having two different characteristics. However, we assign the same activity (the same predicate) to both of these objects. The type of the structure is given by the particular adjectives. It is not common in the real world for a car to be both blue and green at the same time.¹² If it were a dirty and scratched car, it would probably be perceived as only one vehicle.

More syntactic problems are connected with the plural. While in the sentence:

    (31) Na ulici stálo špinavé a poškrábané auto.
         (There was a dirty and scratched car parking on the street.)

we assign both characteristics to only one object (a car), in the sentence:

    (32) Na ulici stála špinavá a poškrábaná auta.
         (There were dirty and scratched cars parking on the street.)

we do not insist on assigning both characteristics to all of the vehicles.

The second sentence thus cannot be interpreted as a conjunction, but rather as a disjunction.¹³ In the case of the plural we cannot consider this feature a specific property of particular adjectives. This phenomenon is not related to the specific semantics of given lexemes, but concerns all adjectives without exception.

Below we list some other sentences which should be considered:

    (33) Článek se zabývá aktivními a pasivními příjmy.
         (The article discusses active and passive incomes.)

    (34) Sešli se tam všichni místní slavní a bohatí lidé.
         (All the local famous and rich people met up at the event.)

    (35) Jako cestovatel se dostal na mnohem zajímavější a podivuhodnější místa.
         (As a traveler, he got to far more interesting and remarkable places.)

¹¹ That would probably be the first interpretation to come to mind without thinking about any further meanings.
¹² In this case, the car would probably rather be described as blue-green.
¹³ However, we still need to take into account the level we are speaking about. At the level of the noun (or noun phrase) to which the adjectives are attributed, we talk about disjunction (it is not required that every object has both characteristics); in the context of the whole sentence, the conjunction a (and) behaves as a conjunction again. This means that if there were only dirty cars parking on the street, we would wonder where the scratched ones mentioned in the sentence are.

3 Interpretation of the Sentences Containing the Conjunction nebo (or)

As we have shown in the previous section, the interpretation of the conjunction a (and) is not an easy task. Surprisingly, the conjunction nebo (or) behaves more systematically.

The conjunction nebo (or) can be interpreted in two ways:

• as a disjunction,
• as an exclusive disjunction.

Unlike English, Czech has rather strict rules distinguishing between these two cases.¹⁴ If nebo (or) follows a comma, it is an exclusive disjunction. In all other cases, it is considered a common disjunction:

    (36) Čertovi se také říká ďábel nebo satan.
         (The demon is also called a devil or a Satan.)
         Reading: disjunction.

    (37) Honza přijede ve středu, nebo až ve čtvrtek.
         (Honza is coming on Wednesday, or on Thursday.)
         Reading: exclusive disjunction.

Naturally, in the spoken language we have no chance to find out whether there is a comma in the sentence or not.¹⁵ Therefore, we have a lexical distinction at our disposal: the exclusive nebo (or) becomes a correlative conjunction, namely buď–nebo (either–or):

    (38) Honza přijede buď ve středu, nebo až ve čtvrtek.
         (Honza is coming either on Wednesday or on Thursday.)
         Reading: exclusive disjunction.

The conjunction nebo (or) can also be part of a more complex connection which can be further expressed using other logical connectives. For illustration, see the following sentences:

    (39) Ať už Honza přijede, nebo ne, oslava se bude konat.
         (Whether Honza is coming or not, we will throw the party.)

    (40) Ať už Honza přijede ve středu, nebo ve čtvrtek, rozhodně navštíví také prarodiče.
         (Whether coming on Wednesday or Thursday, Honza will definitely drop by his grandparents.)

As for sentence 39, we can write down the proposition using a special logical connective Maybe and (MA, truth depends on the second proposition) described for example in [9]: Honza is coming MA we will throw a party.

Sentence 40 is however much more complex. Although it expresses a contrast between two possibilities, neither of them is the negation of the other. Therefore we cannot use the connective MA, since it is a binary connective and we need to connect three propositions.¹⁶ Sentence 40 can be transformed into a logical notation in the following way:

    (41) (Honza přijede ve středu ⊕ Honza přijede ve čtvrtek) ∧ Honza rozhodně navštíví prarodiče.
         ((Honza is coming on Wednesday ⊕ Honza is coming on Thursday) ∧ Honza will drop by his grandparents.)

Therefore, we would have to use the exclusive disjunction again.¹⁷

¹⁴ In English, we use a comma preceding the conjunction or when it connects two independent sentences, regardless of the relationship between them.
¹⁵ In Czech, we place commas based on structural rules, i.e. not in places where there is a natural break in the spoken utterance.
¹⁶ 1. Honza is coming on Wednesday. 2. Honza is coming on Thursday. 3. Honza will drop by his grandparents.
¹⁷ The connective MA can also be expressed using the set of connectives {∧, ∨, ¬}. It is also interesting to consider whether the construction Ať už (...), nebo (...) (Whether (...) or (...)) can be captured using a common disjunction or using the exclusive one. If we are describing a consistent system (such as propositional logic), we already know that the situation A ∧ ¬A is impossible, so both versions of the translation, with ∨ and with ⊕, are equivalent if the same formula appears in the connection (or, more precisely, the formula and its negation). Finally, we have to mention that both variants (with and without a comma) were found in the corpus.

4 Conclusion

In this article, we have discussed the interpretation of natural language sentences using the means of logic. We have shown that although the names of some logical connectives are motivated by natural language conjunctions and they quite often have a similar meaning, it is not possible to translate them from natural language to logic directly. Especially for the conjunction a (and) we have introduced more complex problems (namely the issues of relations, plurals and temporal sequence) which prevent us from identifying a (and) with logical conjunction.

We have also presented an analysis of the possibilities (and problems) which have to be considered when working below the sentential level. We have shown how to transform these structures so that they can be described using the means of propositional logic (which takes only propositions or, in other words, sentences).

5 Acknowledgments

This research was supported by the grant GA15-06894S of the Grant Agency of the Czech Republic and by the SVV project number 260 224. This work has been using language resources stored and/or distributed by the LINDAT-Clarin project of MŠMT (project LM2010013).

References

[1] Chomsky, N.: Syntactic Structures. Werner Hildebrandt, Berlin (2002)
[2] Dummett, M.: Origins of Analytical Philosophy. Harvard University Press, Cambridge, Massachusetts (1996)
[3] Marvan, T.: Otázka významu. Togga, Praha (2010) In Czech.
[4] Gahér, F.: Logika pre každého. Iris, Bratislava (1995) In Slovak.
[5] Øhrstrøm, P., Hasle, P.F.V.: Temporal Logic: From Ancient Ideas to Artificial Intelligence. Kluwer Academic Publishers, Dordrecht, Netherlands (1995)
[6] Sag, I., Wiesler, S.: Temporal connectives and logical form. In: Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society, Berkeley (1979) 336–349
[7] Chytrý, V.: Logika, hry a myšlení. Univerzita J. E. Purkyně, Ústí nad Labem (2015) In Czech.
[8] Baker, C.L., McCarthy, J.J.: The Logical Problem of Language Acquisition. The MIT Press, Cambridge, Massachusetts (1981)
[9] van Wijk, M.: Logical connectives in natural language. Doctoraalscriptie Algemene Taalwetenschap. Universiteit Leiden Faculteit der Letteren, Leiden (2006) Available online: https://www.era.lib.ed.ac.uk/bitstream/handle/1842/5822/VanWijk2006.pdf.
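
As a hedged illustration of two truth-functional claims made in the paper, the following Python sketch enumerates truth values directly. It first compares the schemes (¬A ∧ ¬B) → ¬C and (¬A ∨ ¬B) → ¬C from Section 2.2, then checks the remark in footnote 17 that ∨ and ⊕ coincide when they connect a formula and its negation. The function names are hypothetical, and reading A as "Honza submits the thesis in time", B as "Honza registers for the state exams" and C as "Honza finishes his studies this year" is an interpretive assumption based on example (15), not notation defined by the authors.

```python
from itertools import product


def implies(p, q):
    """Material implication p -> q."""
    return (not p) or q


def scheme_and(a, b, c):
    """(not A and not B) -> not C: the literal reading of Czech 'a' (and)."""
    return implies((not a) and (not b), not c)


def scheme_or(a, b, c):
    """(not A or not B) -> not C: the reading argued for in Section 2.2."""
    return implies((not a) or (not b), not c)


def differing_rows():
    """Return the valuations (A, B, C) on which the two schemes disagree."""
    return [v for v in product([True, False], repeat=3)
            if scheme_and(*v) != scheme_or(*v)]


def xor_equals_or_on_complements():
    """Footnote 17: for A and its negation, 'or' and exclusive 'or' agree."""
    return all((a or (not a)) == (a != (not a)) for a in [True, False])


if __name__ == "__main__":
    for a, b, c in differing_rows():
        print(f"A={a} B={b} C={c}: "
              f"and-scheme={scheme_and(a, b, c)}, or-scheme={scheme_or(a, b, c)}")
    print("or/xor agree on A, not A:", xor_equals_or_on_complements())
```

Under these assumptions the two schemes disagree exactly when one of the two conditions is fulfilled, the other is not, and Honza nevertheless finishes his studies: the conjunctive scheme remains true in that situation, whereas the disjunctive scheme is false, which matches the intuition described in Section 2.2.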