Automated Identification of Metaphors in Annotated Corpus (Based on Substance Terms)

Olena Levchenko, Oleh Tyshchenko, and Marianna Dilai

Lviv Polytechnic National University, Bandera Str., 12, Lviv, 79000, Ukraine

Abstract
Automatic or automated metaphor identification remains a challenging problem. The methods proposed so far have mostly been developed for the English language and can be roughly divided into two groups: those intended for annotated corpora and those intended for non-annotated corpora. In addition, neural networks are used. It should also be noted that recently developed measures of the degree of semantic association between collocation components (T-score, MI, logDice, etc.) fail to detect metaphorical expressions. Previously, we presented a method of automated identification of metaphorical expressions (adjective + noun) for non-annotated corpora of Ukrainian prose texts, based on the analysis of dictionary definitions. This paper describes a method of automated identification of metaphors in a semantically annotated corpus of texts. The algorithm is based on the theoretical propositions and readings of metaphor within the framework of Conceptual Metaphor Theory. The methodology contains an empirical stage at which structural-semantic models of metaphors are detected and classified based on the semantic category of the words in the right-hand position. The performance analysis and the evaluation of the method’s effectiveness are presented.

Keywords
Metaphor, annotated corpus, substance nouns, automated identification of metaphor

1. Introduction
Automatic/automated metaphor identification remains a challenging problem. The methods introduced so far have mostly been developed for the English language and fall into methods designed for semantically annotated, metaphorically annotated and non-annotated corpora. A detailed analysis of the approaches used today is presented in [1, 2, 3, 4] and others.
It should be noted that different methods of automated metaphor identification are based on different theoretical readings of metaphor; however, most modern approaches are grounded in Conceptual Metaphor Theory [5, 6]. Given the various interpretations of metaphor, researchers use different terminology: ‘promising metaphorical words’ [1]; aspect words, abstractness of the aspect words [7, 8] and others. The VUAMC corpus is an example of a metaphorically annotated corpus of the English language, annotated with the MIPVU methodology (Metaphor Identification Procedure Vrije Universiteit) [9]. This technique involves establishing the basic meaning of the word and then determining the degree of contrast between the basic and contextual meanings. To avoid subjectivity, two or more annotators are involved in this procedure and must reach an agreed decision [9]. Previously, we developed a method of automated identification of metaphorical expressions (adjective + noun) for non-annotated corpora of Ukrainian prose texts, based on the analysis of dictionary definitions [10]. It has been successfully applied in a number of studies [11, 12], which proved the effectiveness of the proposed method. In addition, a Master's thesis carried out under our supervision describes structural-semantic models of zoomorphic metaphors in a semantically non-annotated corpus of prose texts [13].

COLINS-2021: 5th International Conference on Computational Linguistics and Intelligent Systems, April 22–23, 2021, Kharkiv, Ukraine
EMAIL: levchenko.olena@gmail.com (Olena Levchenko); olkotiszczenko@gmail.com (Oleh Tyshchenko); mariannadilai@gmail.com (Marianna Dilai)
ORCID: 0000-0002-7395-3772 (Olena Levchenko); 0000-0001-7255-2742 (Oleh Tyshchenko); 0000-0001-5182-9220 (Marianna Dilai)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
This paper presents a method for automated identification of metaphors based on substance terms in a semantically annotated corpus of texts. It should be noted that there are a number of principles of semantic annotation of corpora [14]. The proposed method is based on theoretical provisions about metaphor within the cognitive theory of metaphor. The analysis was conducted using the GRAC corpus [15]. Since GRAC V10 contains semantic annotation of only the most frequent words, we made an attempt to semantically classify the collocates in order to test the hypothesis, using the principles of semantic classification applied in the RNC [16].

2. Theoretical linguistic background
In the ‘precognitive’ period G. Sklyarevska identified the following types of regular metaphorical transference (cognitivists mostly use the term blending of conceptual domains instead of metaphorical transference): 1) OBJECT → OBJECT; 2) OBJECT → HUMAN BEING; 3) OBJECT → PHYSICAL WORLD; 4) OBJECT → MENTAL WORLD; 5) OBJECT → ABSTRACTION; 6) ANIMAL → HUMAN BEING; 7) HUMAN BEING → HUMAN BEING; 8) PHYSICAL WORLD → MENTAL WORLD [17] (see Fig. 1).

Figure 1: Types of regular metaphorical transference

It has been revealed that conceptualization of mental states is characterized by personification, perception of emotions as liquids, fire, light, elements (verbalization by the terms of these concepts) [18]. O. Levchenko's phraseological study presents a range of metaphors within the verbalization of mental states: STATE IS FORCE (FIRE, COLD (FROST), WIND, FLUID (WATER)), LIVING BEING (BEAST, RIVAL), LOCUS, CONTAINER; NORMAL STATE IS BALANCE, BODY IS CONTAINER OF STATES [19].
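The eight types of regular transference listed earlier in this section can be held in a small lookup structure. The following Python sketch is only an illustration: the names `TRANSFERENCE_TYPES`, `is_regular_transference` and `targets_of` are hypothetical and do not come from the paper or from any corpus tool.

```python
# Regular transference types listed in the text, as (source, target) pairs.
# The constant and function names are illustrative assumptions.
TRANSFERENCE_TYPES = [
    ("OBJECT", "OBJECT"),
    ("OBJECT", "HUMAN BEING"),
    ("OBJECT", "PHYSICAL WORLD"),
    ("OBJECT", "MENTAL WORLD"),
    ("OBJECT", "ABSTRACTION"),
    ("ANIMAL", "HUMAN BEING"),
    ("HUMAN BEING", "HUMAN BEING"),
    ("PHYSICAL WORLD", "MENTAL WORLD"),
]


def is_regular_transference(source, target):
    """Return True if source -> target is one of the listed regular types."""
    pair = (source.strip().upper(), target.strip().upper())
    return pair in TRANSFERENCE_TYPES


def targets_of(source):
    """List the target domains that the given source domain can be mapped onto."""
    key = source.strip().upper()
    return [t for s, t in TRANSFERENCE_TYPES if s == key]


if __name__ == "__main__":
    print(is_regular_transference("animal", "human being"))
    print(targets_of("object"))
```

The list is directional: PHYSICAL WORLD → MENTAL WORLD is present, while the reverse pair is not, which mirrors the one-way character of the transference types above.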
Thus, formalizing metaphorical models, we can speak of a ‘metaphorical hierarchy’ (which is somewhat different from the semantic hierarchy of features): mental states (feelings, cognition, speech) are metaphorized in terms of physiological states (life, death, body temperature, hunger), personification (a living being – a human being, an animal), elements, flora or substance, but not vice versa; physiological states are metaphorized by personification (a living being – a human being, an animal), elements, flora or substance.

The research hypothesis is that a combination of nouns that belong, by their semantics, to different levels of the hierarchy should indicate the formation of a metaphor (the greater the interconceptual distance, the more likely the expression is a metaphor). Substance nouns are at a lower level of the hierarchy than nouns belonging to semantic categories/thematic classes r:abstr, t:be, t:be:exist, t:ment, t:humq, t:behav, t:psych, t:psych:emot, t:speech, etc. In terms of substance semantics, a number of mental states are metaphorized (see Table 1).

We can state that the metaphor ANGER IS POISON is universal; it has various manifestations in the Slavic languages: Ukr. псувати/зіпсувати кров ‘spoil/have spoiled one’s blood’ 1) ‘to be nervous, irritated’; 2) ‘to cause someone a lot of trouble; to make someone nervous, to irritate someone’ [20]; Bulg. тровя (отравям/отровя) кръвта на някого ‘poison one's blood’ [21]; Pol. psuć komuś krew, napsuć krwi ‘wprawiać w zły humor; drażnić, dokuczać’ ‘to spoil one's blood, to put someone in a bad mood; to tease’ [22]. In the above examples we do not observe the use of the actual term poison, in contrast to Bulg. държи ме ядът ‘anger holds me’ [23], where яд ‘poison’ means ‘strong irritation, malice, anger’; изкарвам/изкарам (изливам/излея) си яда ‘pour out your anger’; яд ме е ‘I'm angry’ [23].
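The research hypothesis stated above can be sketched as a simple check on the semantic tags of the noun that follows the substance noun. The Python sketch below is a deliberate simplification under stated assumptions: the tag set is limited to the classes named in the text, a collocate counts as a metaphor candidate as soon as it carries any of them, and all names (`HIGHER_LEVEL_TAGS`, `has_tag`, `is_metaphor_candidate`) are hypothetical rather than part of the described method or of any corpus API.

```python
# Classes named in the text as lying above substance nouns in the hierarchy.
# Treating any one of them as sufficient is an assumption of this sketch.
HIGHER_LEVEL_TAGS = {
    "r:abstr", "t:be", "t:be:exist", "t:ment", "t:humq",
    "t:behav", "t:psych", "t:psych:emot", "t:speech",
}


def parse_tags(annotation):
    """Split a whitespace-separated tag string into a set of tags."""
    return set(annotation.split())


def has_tag(tags, wanted):
    """True if `wanted` or one of its subclasses (e.g. t:psych:emot) is present."""
    return any(t == wanted or t.startswith(wanted + ":") for t in tags)


def is_metaphor_candidate(collocate_annotation):
    """Flag the collocation when the right-hand noun has a higher-level class."""
    tags = parse_tags(collocate_annotation)
    return any(has_tag(tags, wanted) for wanted in HIGHER_LEVEL_TAGS)


if __name__ == "__main__":
    # Illustrative tag strings in the notation used in this paper.
    print(is_metaphor_candidate("t:psych:emot r:abstr"))  # abstract emotion noun
    print(is_metaphor_candidate("t:animal r:concr"))      # concrete animal noun
```

A rule of this kind only proposes candidates. Words that carry both concrete and abstract tags, or abstract nouns used in scientific prose, would still need a finer decision than this sketch makes.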
Table 1
Blended conceptual domains during the metaphorization of mental states

A mental state different from "normal" (linked by ↕ to each of the domains below)
Living being | Physiological states | Force/element | Substance (usually fluid) | Locus | Container
animal, rival | hunger, temperature | light, fire, cold, wind, water | poison, balm, nectar, honey | body as a container | soul, heart

J. Dunn concludes that metaphor is a gradient phenomenon: certain metaphorical expressions are characterized by a higher degree of metaphoricity, others are less metaphorical. The researcher is convinced that the problem of different systems of metaphor identification lies in the fact that such systems are based on binary division, whereas the phenomenon in question is of a gradient nature [24]. Therefore, he proposes to determine the degree of metaphoricity of an expression depending on its metaphorical saturation [24]. The degree of metaphoricity is a very subjective feature; its identification directly depends on the experience of the annotator. Interestingly, many works on metaphor discuss the problem of classifying the phenomenon and assessing its metaphoricity. For example, Yu. Badryzlova notes: “in some cases, it is problematic to unambiguously classify a meaning as a basic meaning or a non-basic meaning” [3]. However, we entirely agree that metaphor is a complex and diverse phenomenon. The abovementioned properties are manifested in different types of metaphors (structural – word, phrase, sentence, text (a metaphor that grows into an allegory within the text); conceptual – associated with different ways of conceptualization; perception by ‘degree of metaphoricity’, etc.). We also agree that different procedures should be used to identify different types of metaphors.

‘Traditional’ linguistic analysis, in particular within cognitive linguistics, consists of three stages, namely identification, interpretation and explanation [25]. J.
Charteris-Black claims that “metaphor identification is initially concerned with ideational meaning – that is, identifying whether they are present in a text and establishing whether there is a tension between a literal source domain and a metaphoric target domain. Metaphor interpretation is concerned with interpersonal meaning – that is, identifying the type of social relations that are constructed through them. Metaphor explanation is concerned with textual meaning: that is, the way that metaphors are interrelated and become coherent with reference to the situation in which they occur” [26]. The data necessary for the automated metaphor identification are obtained at the stage of interpretation (when the blended conceptual domains get formalized description in the form of classification into semantic categories/ thematic classes). In this paper at the stage of ‘human’ identification of the metaphor we view collocations of the formal-grammatical model SUBSTANCE NOUN (ОТРУТА, БАЛЬЗАМ, НЕКТАР, МЕД ‘POISON, BALM, NECTAR, HONEY’) + NOUN. The second component of collocation is classified based on a particular semantic category/thematic class it belongs to. Previously, a comparative technique was proposed by Turney [6], but the analysis of ‘the mixture of abstract and concrete words’ was carried out within the sentence. In linguistics the metaphors of the model in question are called genitive metaphors (because the second noun is mostly in the genitive case) or metaphors-comparisons. However, it is necessary to take into account another property of such metaphors. In the course of metaphorization, we can observe a complete and partial transfer of terms at the verbal level from blended conceptual domains. For example, cf. 
баламутити голови кому ‘to muddy one's head’ 1) ‘to fool someone’, 2) ‘to incite someone to bad actions, deeds’ [20]; ловитися/впійматися на гачок (на вудочку) ‘to be caught on a hook (on a fishing rod)’ [20], закидати/закинути вудку (гачок, гака) ‘to throw a fishing rod (hook)’ [20]. The first phraseological unit includes the component голова ‘head’, which belongs to the conceptual domain HUMAN BEING, while баламутити ‘to muddy’ belongs to the conceptual domain WATER. In the examples ловитися/впійматися на гачок (на вудочку), закидати/закинути вудку (гачок, гака) ‘to be caught on a hook (on a fishing rod), to throw a fishing rod (hook)’, the components belong to the conceptual source domain FISHERY [27]. Thus, the metaphors of the model SUBSTANCE NOUN + NOUN contain terms of two conceptual domains, which provides explicit markers for automatic identification of the minimal metaphorical context. In fact, the substance noun metaphorizes the second component in the collocation.

3. Results and discussion
Based on the corpus data, we have identified 643 contexts with the component отрута ‘poison’: in 415 of them the word is used literally and in 228 it is used metaphorically. It should be noted that words/collocations in quotation marks can be considered metaphorical if there are no markers of intertextuality. We have singled out the following cases of literal use (see Table 2).
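As a hedged illustration of how candidate collocations of the model SUBSTANCE NOUN + NOUN might be collected from a lemmatized, part-of-speech-tagged sentence before the right-hand noun is classified, consider the Python sketch below. The `Token` structure, the `"noun"` label and the function name are assumptions made for the example; they do not reproduce the corpus query actually used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    form: str   # word form as it occurs in the text
    lemma: str  # dictionary form
    pos: str    # part-of-speech label; the label set here is hypothetical


# The four substance nouns examined in the paper.
SUBSTANCE_LEMMAS = {"отрута", "бальзам", "нектар", "мед"}


def substance_noun_pairs(tokens):
    """Return (substance lemma, right-hand noun lemma) for adjacent noun pairs."""
    pairs = []
    for left, right in zip(tokens, tokens[1:]):
        if (left.pos == "noun" and left.lemma in SUBSTANCE_LEMMAS
                and right.pos == "noun"):
            pairs.append((left.lemma, right.lemma))
    return pairs


if __name__ == "__main__":
    # 'гірка отрута ненависті' = 'the bitter poison of hatred' (illustrative input)
    sentence = [
        Token("гірка", "гіркий", "adj"),
        Token("отрута", "отрута", "noun"),
        Token("ненависті", "ненависть", "noun"),
    ]
    print(substance_noun_pairs(sentence))
```

Only the immediately following noun is inspected, in line with the focus on the word in the right-hand position; a wider window or a dependency parse would be a different design and is not attempted here.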
Table 2
Non-metaphorical collocations

Lexemes | Semantic category/thematic class | Frequency | Result
бацила ‘bacillus’ 1, вівчарка ‘sheep dog’ 1, гюрза ‘viper’ 4, оса ‘wasp’ 2, плазун ‘reptile’ 1, собака ‘dog’ 2, тарантул ‘tarantula’ 1, кобра ‘cobra’ 14, медуза ‘jellyfish’ 1, павук ‘spider’ 3, риба-пузан ‘puzan fish’ 1, тварина ‘animal’ 3, фугу ‘fugu’ 2, щитомордник ‘pit viper’ 2, бджола ‘bee’ 6, гадюка ‘viper’ 20, змія ‘snake’ 21, істота ‘creature’ 1, каракурт ‘karakurt’ 1, комаха ‘insect’ 10, саламандра ‘salamander’ 1, скорпіон ‘scorpion’ 4, хабу ‘habu’ 1, шершень ‘hornet’ 1 | t:animal r:concr | 104 | +
Я ‘I’, ти ‘you’, ми ‘we’, він ‘he’, вона ‘she’ | | 69 | +
банкір ‘banker’ 1, брат ‘brother’ 1, диктатор ‘dictator’ 1, дружина ‘wife’ 1, дуелянт ‘duelist’ 1, імператор ‘emperor’ 1, Клод ‘Claude’ 1, Литвиненко ‘Lytvynenko’ 1, лях ‘Pole’ 1, маля ‘baby’ 1, Марія ‘Maria’ 1, медсестра ‘nurse’ 1, Млада ‘Mlada’ 1, Надія ‘Nadiya’ 1, наркоділок ‘drug dealer’ 2, Нерон ‘Nero’ 1, отаман ‘ataman’ 1, Ребека ‘Rebecca’ 1, султанша ‘sultana’ 1, сусідка ‘neighbor’ 1, Хруст ‘Khrust’ 1, цариця ‘queen’ 1, Анна ‘Anna’ 1, аспірант ‘graduate student’ 1, вояк ‘soldier’ 1, діти ‘children’ 5, жінка ‘woman’ 1, лідер ‘leader’ 1, пан ‘lord’ 1, піп-грек ‘Greek priest’ 1, Сальєрі ‘Salieri’ 1, селюк ‘peasant’ 1, син ‘son’ 2, слуга ‘servant’ 1, коломийка ‘kolomyyka’ 1, Несса ‘Nessa’ 1, волхв ‘magician’ 1, гер ‘herr’ 1, Гофман ‘Hoffman’ 1, злочинець ‘criminal’ 1, Калиновський ‘Kalynovsky’ 1, Клаус ‘Klaus’ 1, князь ‘prince’ 2, колега ‘colleague’ 1, лікар ‘doctor’ 2, пацієнт ‘patient’ 2, Пилип'юк ‘Pylypyuk’ 1, Серафима ‘Seraphim’ 1, спеціалісти ‘specialists’ 1, хазяїн ‘owner’ 1, людина ‘person’ 3, мисливець-убивця ‘killer hunter’ 5 | t:hum r:concr | 67 | +
квітка ‘flower’ 1, листя ‘leaves’ 1, посів ‘sowing’ 1, зілля ‘potion’ 1, рослина ‘plant’ 2, цикута ‘hemlock’ 1, анчар ‘antiar’ 1, гриб ‘mushroom’ 3, колючка ‘thorn’ 1, кураре ‘curare’ 28, плющ ‘ivy’ 2, поганка ‘toadstool’ 1 | t:plant r:concr | 43 | +
The word отрута ‘poison’ is followed by a punctuation mark | | 25 | +
Гаспид ‘elapid’ 5, чортище ‘devil’ 1, аспид ‘elapid’ 2, василіск ‘basilisk’ 5, гідра ‘hydra’ 2, потвора ‘monster’ 1 | t:animal t:stuff r:concr | 16 | +
вогонь ‘fire’ 1, наркотик ‘narcotic’ 1, актиній ‘sea anemone’ 2, газ ‘gas’ 1, дим ‘smoke’ 1, кисень ‘oxygen’ 1, ліки ‘medication’ 1, діоксин ‘dioxin’ 5, речовина ‘substance’ | t:stuff r:concr | 14 | +
слід ‘should’ 3, треба ‘must’ 5 | r:mod | 8 | +
обличчя ‘face’ 5, лице ‘face’ 1 | pt:partb r:concr pc:hum | 6 | +
кров ‘blood’ | t:stuff r:concr t:liq | 6 | +
стріла ‘arrow’ | t:tool r:concr top:rod | 4 | +
ікло ‘fang’ 1, голова ‘head’ 2 | pt:partb pc:animal r:concr pc:hum | 3 | +
організм ‘organism’ | t:living r:concr | 3 | +
організація ‘organization’, рятунок ‘rescue’ | r:concr:org r:abstr | 2 | +
мед ‘honey’ 1, абсент ‘absinthe’ 1 | t:food t:stuff r:concr | 2 | +
спецслужба ‘special service’ | bbr r:concr t:org | 1 | +
орган ‘part of body’ | hi:class pc:hum pc:animal t:tool:mus | 1 | +
страва ‘dish’ | op:horiz r:concr t:tool:dish | 1 | +
нирка ‘kidney’ | pc:hum pt:partb:organ r:concr | 1 | +
залоза ‘gland’ | pc:hum pt:partb:organ r:concr | 1 | +
плоть ‘flesh’ | pc:hum t:stuff r:concr pt:constit | 1 | +
гостряк ‘sharp point’ | pc:tool:instr r:concr pt:part pc:tool:weapon | 1 | +
джунглі ‘jungle’ | pt:aggr t:space r:concr sc:plant | 1 | +
м'ясо ‘meat’ | pt:aggr t:stuff r:concr sc:partb | 1 | +
гроші ‘money’ | pt:aggr:aggrpl t:money r:concr sc:money:banknote hi:class | 1 | +
жало ‘sting’ | pt:partb pc:animal r:concr | 1 | +
місце ‘place’ | qc:space t:space r:concr | 1 | +
крапля ‘drop’ | qc:stuff pt:qtm r:concr | 1 | +
колода ‘log’ | r:concr | 1 | +
клинок ‘blade’ | r:concr pt:part pc:tool:weapon | 1 | +
аптека ‘pharmacy’ | r:concr t:org | 1 | +
КДБ ‘KGB’ | r:propn t:org | 1 | +
вода ‘water’ | r:stuff r:concr t:liq | 1 | +
село ‘village’ | sc:constr:build t:space r:concr | 1 | +
хустка ‘shawl’ | t:access t:tool:cloth r:concr top:cover | 1 | +
страх ‘fear’ | t:degr:max | 1 | +
наркомафія ‘drug mafia’ | t:group sc:hum pt:aggr r:concr | 1 | +
Александр ‘Alexander’ | t:hum r:propn t:persn | 1 | +
регіон ‘region’ | t:space r:concr pt:part pc:space | 1 | +
криниця ‘well’ | t:space top:mineshaft t:constr r:concr | 1 | +
клітина ‘cell’ | t:tool top:contain pc:stuff pt:element r:concr | 1 | +
голка ‘needle’ | t:tool:instr t:tool r:concr d:dim | 1 | +
склянка ‘glass’ | top:contain r:concr t:tool:dish | 1 | +
місяць ‘moon’ | top:horiz top:spher t:astr t:space r:concr | 1 | +
шлях ‘way’ | top:stripe t:space r:concr | 1 | +
справа ‘deal’ | r:abstr | 2 | –
півмільярда ‘half a billion’ 1, тисяч ‘thousand’ 1 | r:abstr t:card | 2 | –
біль ‘pain’ | t:physiol r:abstr | 2 | –
метод ‘method’ 1, номер ‘number’ 1, хіміотерапія ‘chemotherapy’ 1 | r:abstr | 3 | –
параліч ‘paralysis’ | r:abstr t:disease | 1 | –
розмір ‘size’ | r:abstr t:param t:size | 1 | –
сила ‘power’ | sc:hum t:pers r:abstr pt:aggr:aggrpl t:param r:concr | 1 | –
рід ‘family’ | t:group pt:set r:concr sc:hum r:abstr | 1 | –
їжа ‘food’ | t:physiol r:abstr | 1 | –
смерть ‘death’ | t:physiol t:be:disapp r:abstr | 1 | –
видіння ‘vision’ | t:psych t:perc r:abstr | 1 | –

The obtained results contain 11 errors. In a number of cases of the literal use of the word отрута ‘poison’, the right-hand position is occupied by words belonging to semantic categories/thematic classes which hypothetically should indicate the metaphorical nature of the word отрута ‘poison’: Але спричинений отрутою біль негайно повертається, і птахи знову біжать до води ‘But the pain caused by the poison immediately returns and the birds run to the water again’ (H. Quiroga, Cuentos de la Selva, 1946); Поспішиш — і пожалкуєш. Пекуча отрута болем відізветься у тілі. А кожна бджолина сім'я має свій «характер», потрібні індивідуальний підхід… ‘Hurry up and you'll regret it. The burning poison will resonate with pain in the body.
And each bee family has its own "character", requires an individual approach…’ (Journal Ukrainian Beekeeper, 1992); Тут ще зосталась половина, її доволі буде нам, щоб на собаці отрути силу (sc:hum t:pers r:abstr pt:aggr: aggrpl t:param r:concr; GRAC: 1:abst& able: 2:abst:physio& able: 3:abst: 4:abst:param: 5:conc:hum:collect: 6:abst:quantit:max) звідать ‘There is still half left, it will be enough for us to find out the force of poison giving it to the dog’. Собака здохне – тебе живим у землю закопаєм ‘Should the dog die, we will bury you alive in the ground’ (I. Karpenko-Karyi, The evil spark will burn the field and disappear itself, 1893). The accuracy of the results depends on the stylistic type of the text, prose or poetry. “The success of the identification systems varies significantly across genres and sub-classes of metaphor” [24]. False identification is found in texts of scientific or popular science styles that have not been deleted from the search query, such texts are more saturated with abstract terminology: При менших дозах отрути смерть (t:physiol t:be:disapp r:abstr; GRAC: 1:abst:disappear: 2:abst:physio) настає через кілька хвилин. При цьому спостерігаються задишка, судоми, втрата свідомості, відчуття… ‘Given lower doses of poison death comes in a few minutes. It is accompanied by shortness of breath, convulsions, loss of consciousness, sensation’ (I.O. Kontsevich & B.V. Mykhailychenko, Forensic Medicine. Textbook, 1997); Особливо сильно впливає гашиш. При отруєнні цією отрутою видіння виникають одне за одним ‘Hashish has a particularly strong effect. When poisoned by this poison, visions appear one after another’ (M. 
Rubakin, Among Mysteries and Miracles, 1962); Під час проведення сеансів з отримання отрути методом (r:abstr) електростимуляції ведуть облік кількості отриманої отрути від кожної бджолиної сімї… ‘During the sessions for obtaining poison by the method (r: abstr) of electrical stimulation, the amount of poison received from each bee family is recorded’ (Journal Pasika, 34, 2005), etc. It should be noted that in the model SUBSTANCE NOUN (ОТРУТА, БАЛЬЗАМ, НЕКТАР, МЕД ‘POISON, BALM, NECTAR, HONEY’) + NOUN the second noun is mostly in the genitive case, functioning as a non-agreed attribute. Therefore, the verb in the sentence semantically agrees with the conceptual sphere, in our case, of substance nouns. In other words, the prototype poison is a liquid: В її душу вливалась крапля за краплею гірка отрута книжок, що видавалися масовими тиражами в колишній панській Польщі ‘The bitter poison of books published in mass circulation in the former lordly Poland poured into her soul drop by drop’ (Poison, Free Ukraine, 1940); Таким чином, від жовтеняти до переспілого більшовика в свідомість суспільства вливали гадючу отруту ненависті до героїв визвольних змагань на західноукраїнських землях ‘Thus, from Little Octobrist to the overmature Bolshevik, a viper's poison of hatred for the heroes of the liberation struggle in Western Ukraine was infused into the consciousness of society’ (V. Palyvoda, Memoirs of a Ukrainian Insurgent and Long-Term Gulag Prisoner, 2001); Отрута осередку вливається в дитячі душі і сіє розпусту ‘The poison of the center flows into children's souls and sows debauchery’ (S. Yefremov, Diary, 1925). Sometimes the second component of collocations is in the instrumental case, for example, отрута владою ‘poison by power’; in addition, in the word combination отруту вогнем ‘poison by fire’ the instrumental case depends on the verb випалити ‘to burn’. 
Thus, the identification of metaphors through the semantics of the verb in this case will be ineffective, in contrast to contexts in which the abstract concept is an agent. In total, 225 metaphors with the component отрута ‘poison’ were identified, 206 metaphors were identified correctly, and 18 were identified incorrectly by semantic categories/thematic classes. We have revealed the following units in the right-hand position in the literal sense (see Table 3). Inaccurate results are obtained in poetic contexts, for instance: Квіт заполярної півсонної теплиці / Квіт на який пролив бліду отруту місяць / Чиї брати цвітуть на смітниках понурі ‘The flower of the polar half-asleep greenhouse / The flower on which the moon shed a pale poison / Whose brothers bloom in the dumps gloomily’ (V. Nezval, Five Minutes Past the Town, 1972 (translated by A. Malyshko)); А неба пес крилатий гострим дзьобом, Омоченим в отруту вуст твоїх, / Мені шматує серце, й виринає / Огидне кодло привидів, поріддя Безодні темрявої снів ‘And the sky's winged dog with a sharp beak, Dipped in the poison of your lips, / Tears my heart to pieces, and emerges / Disgusting brood of ghosts, the offspring of the Abyss of Dark Dreams’ (Yuri Klen in the context of Ukrainian neoclassicism, 2004). In general, in a number of cases we are dealing with extended metaphors, which call for a different analytical approach (syntactic roles, analysis of abstraction/sentence specificity): Можливо, але я тільки продовжую гру, яку почав ти; я шукаю на твою отруту протиотруту, я вливаю в тебе стільки недовіри, що ти зможеш виблювати її і здоровим повернутись додому ‘Perhaps, but I only continue the game you started, I'm looking for an antidote to your poison, I'm instilling so much distrust in you that you can vomit it out and come home healthy’ (P. Van Aken, Slapende honden, 1972, translation).
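The counts of correctly and incorrectly identified metaphors reported above can be turned into a single share of correct decisions. The Python sketch below is only an illustration: the paper does not name the metric it uses, so taking the share over the two judged counts (correct plus incorrect) is an assumption of this example, as is the function name `share_correct`.

```python
def share_correct(correct, incorrect):
    """Share of correct decisions among all judged items (assumed metric)."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no judged items")
    return correct / total


if __name__ == "__main__":
    # Counts for отрута 'poison' as reported in the text.
    value = share_correct(correct=206, incorrect=18)
    print(f"{value:.2f}")  # roughly 0.92
```

The same function could be applied separately to prose, poetry and scientific texts, since accuracy is reported to vary with the stylistic type of the text.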
Table 3
Metaphorical collocations

Lexemes | Semantic category/thematic class | Frequency | Result
ненависть ‘hatred’ 14, шовінізм ‘chauvinism’ 7, ідеалізм ‘idealism’ 6, брехня ‘lies’ 5, націоналізм ‘nationalism’ 4, зрада ‘betrayal’ 3, влада ‘power’ 2, гріх ‘sin’ 2, єресь ‘heresy’ 2, перемога ‘victory’ 2, нетерпимість ‘intolerance’ 2, утіха ‘consolation’ 1, заціпеніння ‘numbness’ 1, абстракціонізм ‘abstractionism’ 1, авторитаризм ‘authoritarianism’ 1, більшовизм ‘bolshevism’ 2, біологія ‘biology’ 1, буття ‘genesis’ 1, влада ‘power’ 1, гнів ‘anger’ 1, демократія ‘democracy’ 1, євангелізм ‘Evangelism’ 1, ідея ‘idea’ 1, історія ‘history’ 1, лібералізм ‘liberalism’ 1, лузерство ‘being a loser’ 1, людиноненависництво ‘misanthropy’ 1, малодушність ‘cowardice’ 1, махновщина ‘makhnovism’ 1, мистецтво ‘art’ 1, модернізм ‘modernism’ 1, москвофільство ‘Russophilia’ 1, нацизм ‘Nazism’ 1, недовір'я ‘distrust’ 1, ностальгія ‘nostalgia’ 1, оргія ‘orgy’ 1, патріотизм ‘patriotism’ 1, поп-культура ‘pop culture’ 1, популізм ‘populism’ 1, постправда ‘post-truth’ 1, поцілунок ‘kiss’ 1, презирство ‘contempt’ 1, пропаґанда ‘propaganda’ 1, расизм ‘racism’ 1, реваншизм ‘revanchism’ 1, революція ‘revolution’ 1, сіонізм ‘Zionism’ 1, скептицизм ‘scepticism’ 1, агресивність ‘aggression’ 1, бездуховність ‘lack of spirituality’ 1, безнадія ‘hopelessness’ 1, буденність ‘mundaneness’ 1, в'їдливість ‘causticity’ 1, недбальство ‘negligence’ 1, непевність ‘insecurity’ 1, нечистота ‘impurity’ 1, прикрість ‘annoyance’ 1, імперськість ‘imperialism’ 1, чорнота ‘blackness’ 1, самозневага ‘self-loathing’ 1, залежність ‘dependence’ 1 | r:abstr | 100 | +
злоба ‘anger’ 4, заздрощі ‘envy’ 2, образа ‘resentment’ 2, пристрасть ‘passion’ 1, обурення ‘indignation’ 1, насолода ‘pleasure’ 1, заздрість ‘envy’ 1, жах ‘horror’ 1, злість ‘anger’ 1, мука ‘torment’ 1, страждання ‘suffering’ 1 | t:psych:emot r:abstr | 16 | +
кар'єризм ‘careerism’ 2, лицемірство ‘hypocrisy’ 1, милосердя ‘mercy’ 1, себелюбство ‘selfishness’ 1, амбіція ‘ambition’ 1, цинізм ‘cynicism’ 1, честолюбство ‘ambition’ 1, гордощі ‘pride’ 1, допитливість ‘curiosity’ 1, жорстокість ‘cruelty’ 1, зажерливість ‘greed’ 1, мстивість ‘vengeance’ 1 | t:humq r:abstr | 13 | +
кохання ‘love’ 11, любов ‘love’ 1 | t:psych:emot r:abstr t:hum | 12 | +
докір ‘rebuke’ 2, лестощі ‘flattery’ 2, донос ‘denunciation’ 1, обман ‘deception’ 1, слово ‘word’ 2 | t:speech r:abstr | 8 | +
втома ‘fatigue’ | t:physiol t:psych r:abstr | 5 | +
вищість ‘supremacy’ 1, ворожнеча ‘enmity’ 1, дружба ‘friendship’ 1, війна ‘war’ 1, контрреволюція ‘counterrevolution’ 1 | r:abstr t:poss | 5 | +
сумнів ‘doubt’ | t:ment t:psych r:abstr | 4 | +
зневіра ‘despair’ 2, зневір'я ‘despair’ 2 | t:ment t:neg r:abstr | 4 | +
інсинуація ‘insinuation’ 1, думка ‘thought’ 1, правда ‘truth’ 1, спогад ‘memory’ 1 | t:ment r:abstr | 4 | +
ревнощі ‘jealousy’ | t:humq t:psych:emot r:abstr | 4 | +
азарт ‘excitement’ 1, апатія ‘apathy’ 1, передчуття ‘anticipation’ 1 | t:psych r:abstr | 3 | +
смерть ‘death’ | t:physiol t:be:disapp r:abstr | 3 | +
пізнання ‘cognition’ 1, свідомість ‘consciousness’ 1, чекання ‘waiting’ 1 | t:ment r:abstr | 3 | +
ласка ‘caress’ 1, посмішка ‘smile’ 2 | t:manif:emot r:abstr | 3 | +
катування ‘torture’ 1, насилля ‘violence’ 2 | t:impact r:abstr | 3 | +
просвітлення ‘enlightenment’ 2, тління ‘decay’ 1 | t:changest r:abstr | 3 | +
життя ‘life’ | t:be:exist r:abstr | 3 | +
займенник 1 ‘pronoun 1’ | t:word r:concr r:abstr | 2 | +
почуття ‘feeling’ 1, чутка ‘rumor’ 1 | t:perc r:abstr | 2 | +
фальш ‘falseness’ | t:sound r:abstr | 1 | +
сон ‘sleep’ | t:physiol r:abstr | 1 | +
позиція ‘position’ | t:loc r:abstr | 1 | +
вмирання ‘dying’ | t:be:disapp r:abstr | 1 | +
тон ‘tone’ | t:abstr | 1 | +
книжка ‘book’ | r:concr t:text | 1 | +
ніч ‘night’ | r:abstr pt:part t:time | 1 | +
Всесвіт ‘Universe’ 1, осередок ‘centre’ 1 | t:space r:concr | 2 | –
серце ‘heart’ | pc:hum pt:partb:organ r:concr pc:animal | 2 | –
місяць ‘moon’ | top:horiz top:spher t:astr t:space r:concr | 1 | –
сльоза ‘tear’ | t:stuff r:concr t:liq | 1 | –
протиотрута ‘antidote’ | t:stuff r:concr hi:class | 1 | –
особистість ‘personality’ | t:hum ev r:concr ev:nonev | 1 | –
військо ‘troops’ | t:group pt:aggr:aggrpl r:concr sc:hum | 1 | –
п'явка ‘leech’ | t:animal r:concr | 1 | –
місто ‘city’ | sc:constr:build t:space r:concr | 1 | –
Петербурґ ‘Petersburg’ | r:propn t:topon | 1 | –
кінотеатр ‘cinema’ | r:concr t:org | 1 | –
ряд ‘row’ | pt:set sc:x r:concr r:abstr | 1 | –
народ ‘people’ | pt:set sc:hum r:concr | 1 | –
вуста ‘lips’ | pt:partb ev r:concr pc:hum | 1 | –
недруг ‘foe’ | pt:ind sc:hum pt:aggr t:hum r:concr | 1 | –
руїна ‘ruin’ | pt:aggr:aggrpl sc:constr r:concr | 1 | –

The lexemes серце, кров, сльози ‘heart, blood, tears’ in fiction texts can acquire symbolic meaning. “Organs such as heart, blood, tears etc. are symbols verbalizing relevant mental and intellectual states (verbalization is based on the idea that these organs can be both producers and patients)” [19]. Thus, for literary texts it is important to take into account the symbolic meaning (which can be the result of both metaphorization and metonymization), in particular, of a number of somatisms: Весь в свого батька! — виказує жінка те, що завжди в неї на вустах. То її власна отрута серця, яку, до речі, мати виробляє для себе і за власним рецептом ‘Like father like son! – The woman utters what is always on her lips. It is her own poison of the heart, which, by the way, mother produces for herself and according to her own recipe’ (N. Odala, Who are you and what are you doing here, 2011). In GRAC, the word серце ‘heart’ is semantically tagged as 1:conc:body:part: 2:abst:psych:emot: 3:abst: 4:conc:hum:posit, which indicates that the word has a number of meanings and that the ‘abstract’ ones are second and third in frequency. It is important to note that this tag does not imply a negative evaluation. However, in the phraseological systems of the Ukrainian language, the concept of ГНІВ ‘ANGER’ is also verbalized through the idea of changes in the heart functioning (the word серце ‘heart’ is synonymous with гнів, злість, пересердя, іритація, пасія ‘anger, rage, anger, irritation, passion’), increased blood pressure and blood pollution: Ukr.
серце набіга ‘heart raids’; серця додати ‘to add heart’ (cf. Ukr. докладати/докласти душі (серця) ‘apply soul (heart) to sth’ – Belarus. сэрца мець на каго ‘to have heart on someone’ — Bulg. сърцето му се налива с кръв ‘his heart is filled with blood’ [19]. Contexts such as ‘Гостя кивала головою і втирала солону отруту сліз, а коли йшла, то їй уже було легше’ ‘The guest shook her head and rubbed salty poison of tears, and when she walked, it was already easier for her’ (O. Pechorna, Sinner, 2011) prove that the word сльоза ‘tear’ should be annotated with the semantic tag t:manif:emot in the corpus. We have revealed some interesting examples regarding the semantic categories: metaphor: …Ми перемогли місто! Святкуй зі мною нашу перемогу! Тепер ми можемо будувати своє життя, вільне від отрути міста (sc:constr:build t:space r:concr; GRAC 1:conc:loc:container)!.., при цьому виникає специфічний вибір оповідної програми на підставі пізнавальних просторів… ‘We defeated the city! Celebrate our victory with me! Now we can build our lives free from the poison of the city!.., and there is a specific choice of narrative program based on cognitive spaces…’ (T. A. Marchak, Ideological and artistic specificity of G. Mykhailychenko's sketch The City, 2010); Отрута Петербурґа, міста туманів і примар, міста підступної “імітації Європи” – вже зробила своє; Вона, як пошесть, понесла туди отруту руїн. Ти мало напився її ? — Але що ж тут робити, тату? Ми ж помремо! ‘The poison of St. Petersburg, the city of mists and ghosts, the city of insidious "imitation of Europe" — has already done its thing; Like a plague it carried the poison of ruins there. Have you drunk enough of it? – But what shall we do, Dad? We will die!’ (M. Ivchenko, The last minutes, 1919); non-metaphor: Ще тисячі людей були змушені залишити свої оселі й переїхати з уражених отрутою сіл (sc:constr:build t:space r:concr; GRAC 1:conc:loc:container: 2:conc:loc: surface: 2:conc:hum:collect) у безпечні регіони. 
‘Thousands more people were forced to leave their homes and move from poisoned villages to safe regions’ (on-line newspaper; Ukraina Moloda, 2011). The dictionary definitions of the words місто, село ‘city, village’ do not explain the binary opposition city – village, given in fiction and philosophical texts, where the units denoting loci are used in the civilizational sense. Moreover, the word село ‘village’ is used as a metonymy, which is reflected in the dictionary definition and, accordingly, in the semantic tagging of GRAC. Marking the ambiguity of words in tagging represents the semantics accurately (дерево ‘tree’ |1:conc:plant: higherclass: 2:conc:stuff| життя ‘life’ |1: abst:exist: 2:abst:time: 3:abst|), but requires algorithms of contextual distinction between ambiguity and homonymy (as in the case of the word руїна ‘ruin’, the ambiguity of which is given in the dictionaries) and, accordingly, clarification of the metaphor identification methodology, which should take into account the configurations of semantic tags. One of the compilers of the GRAC corpus states: “In cases of ambiguity, ie, when a lemma may have more than one set of semantic tags due to being used in multiple senses, all such sets are listed in the semantic lexicon, leaving the problem of semantic disambiguation for later stages” [Starko 2020]. The solution to this problem can be additional verification of individual results using the method proposed in [10]. The formal approach does not work when the word отрута ‘poison’ collocates with words that belong to the thematic class pt:set | pt:aggr – group and collective objects (меблі, людство ‘furniture, humanity’): Усе ж разом узяте виконує одну й ту ж роль духовної отрути народу (pt:set sc:hum r:concr; GRAC 1:conc:hum:group:higherclass) ‘All together performs the same role of spiritual poison of the people’ (V. 
Koptilov, Letters from Paris; letter G6; devilry and mysticism, 1974); До того, що все краще в нашій історії сфальшовано, заплямовано зміїною отрутою недругів. ‘To the fact that all the best in our history is falsified, tainted with snake poison of enemies’ (A. Palamar, Degradation, 2004); …веде своє, ще не порушене революційною отрутою військо (t: group pt:aggr:aggrpl r:concr sc:hum; GRAC 1:conc:hum:group:collect) ‘…leads its army, which has not yet been violated by the revolutionary poison’ (S. Mazlakh, V. Shakhray, The union of proletariat and bourgeoisie against world imperialism, 1919). In some cases, the left-hand environment may serve as the clarifying factor for identification: духовної отрути, революційною отрутою ‘spiritual poison, revolutionary poison’.

We have also found less common cases in which the word combination отрута ‘poison’ + collocate (t:animal r:concr) is metaphorical: …гроші, лицемірство, багатство, блиск і зовнішній шик брали верх над усім. Ця отрута п'явкою впивалася в її свідомість, точила її серце ‘…money, hypocrisy, wealth, brilliance and external chic prevailed over everything. This poison, like a leech, penetrated her consciousness and gnawed at her heart’ (Poison, Free Ukraine, 1940). This is a typical example of an extended metaphor. This metaphor is lexicographically fixed: впиватися (впитися) п'явкою (як п'явка) в серце; мов (немов, наче і т. ін.) п'явки за серце ссуть ‘to get like a leech into the heart; like (as if, etc.) leeches suck the heart’ [28].

Another example shows the generally typical phenomenon of a combination of metaphor and metonymy, i.e. metaphtonymy: Наприклад, духовну смерть широких мас, що вдень тяжко працюють, а ввечорі засуджені на отруту кінотеатру і телевізії, вони вважають цілком нормальною. ‘For example, the spiritual death of the masses, who work hard during the day and in the evening are doomed to the poison of cinema and television, they consider quite normal’ (C. Miłosz, The captive mind, 1983, translated by B.
Struminskyi).

Analyzing realizations of the model БАЛЬЗАМ, НЕКТАР, МЕД ‘BALM, NECTAR, HONEY’ + noun, we observe comparable lists of semantic tags of noun collocates for metaphors and non-metaphors. The frequency of specific semantic tags of noun collocates differs for each of the analyzed words, as does the ratio of metaphors to non-metaphors (see Table 4). The most frequent of the analyzed words are the least metaphorical.

Table 4
The ratio of metaphors and non-metaphors

| Substance noun | Frequency in the corpus | Total number of collocations substance noun + noun | Non-metaphor (absolute/relative) | Metaphor (absolute/relative) |
|---|---|---|---|---|
| отрута ‘poison’ | 9,408 (14.40 ipm) | 646 | 418 (64.7%) | 228 (35.29%) |
| бальзам ‘balm’ | 1,365 (2.09 ipm) | 63 | 34 (53.97%) | 29 (46.03%) |
| мед ‘honey’ | 26,326 (40.40 ipm) | 1378 | 1221 (88.61%) | 157 (11.39%) |
| нектар ‘nectar’ | 1,763 (2.42 ipm) | 135 | 91 (67.41%) | 44 (32.59%) |

We have found a high percentage of metaphors among the collocations with the component бальзам ‘balm’: Не страх за шпигунство Галкіна, ні сподівання «приказа» вже не триволоіли його; жінчина любов, тихі, розумні розмови з тестем наче бальзам спокою вливались в його серце... ‘Neither the fear of Galkin's espionage nor the hope of the "order" bothered him; his wife's love and quiet, intelligent conversations with his father-in-law poured into his heart like a balm of peace…’ (O. Konyskyi, Yuriy Gorovenko. Chronicle of Troubled Times, 1883); І, думав Піфагор над смислом земного буття, болісно перебирав у свідомості мудрі вислови філософів та шукачів Істини і не міг покласти на рану серця цілющого бальзаму заспокоєння… ‘And, Pythagoras thought about the meaning of earthly existence, he painfully went over in his mind the wise sayings of philosophers and seekers of Truth and could not put on the wound of the heart the healing balm of soothing…’ (O. Berdnyk, Veil of Isis, 1969); Взагалі мрія це бальзам душі ‘In general, a dream is a balm of the soul’ (F.
Odrach, On uncertain ground); І тепер його душа колисалася на хвилі світлої печалі, купалась у бальзамі співчуття, який так щедро виливала на нього Мері ‘And now his soul was floating in a sad serenity. It was embalmed in the sympathy that Mary so generously poured’ (Aldous Huxley, Crome Yellow, translated by V. Vyshnevyi, 1978). All the above sentences contain extended metaphors and/or a combination of metaphor and metonymy, but in this paper we consider only the minimal metaphorical context.

Interestingly, the metaphorically synonymous terms бальзам ‘balm’ and нектар ‘nectar’ verbalize concepts that only partially intersect: Може, тобі здавалося принизливим почуття дівчини, котру ти ще недавно напував нектаром знання, зібраним з квіток всіх віків і народів? ‘Perhaps you found humiliating the feelings of the girl to whom you recently gave to drink the nectar of knowledge collected from flowers of all ages and peoples?’ (O. Berdnyk, The darkness does not ignite the hearth, 1993); – Ви п'яні, правда? – перебила мавка. – Я п'яний нектаром кохання ‘– You're drunk, aren't you? – the mavka interrupted. – I am drunk with the nectar of love’ (O. Turyansky, Son of the Earth, 1933); …вона, справді, мов та бджола, невтомно трудилася – несла своєму народові цілющий нектар освіченості і культури – попри всі безкінечні імперські утиски. ‘She, indeed, like a bee, worked tirelessly – carried to her people the healing nectar of education and culture – despite all the endless imperial oppression’ (I.
Kuchernyuk, Magazine Native Land in socio-political and cultural life of Ukraine (1905-1916.), 2016); Геній Карла XII пірвав їх, як гураган, вихопив з рідних хат, з обіймів батьків, жінок і дітей і нестримним летом ніс назустріч невідомих подій у невідомих краях, і як безпритомних, як напоєних узваром забуття і задурманених нектаром слави, кидав в обійми терпіння, каліцтва і смерті ‘The genius of Charles XII swept them away like a hurricane, snatched them from their homes, from the arms of parents, wives and children, and carried them to unknown events in unknown lands, and, as if unconscious, as if intoxicated with forgetfulness and by the nectar of glory, threw them into the arms of ordeal, injuries and deaths’ (B. Lepkyi, Poltava, 1929).

Collocations with the component мед ‘honey’ are metaphorized the least frequently. Typical are collocations with the components кохання, знання, вчення ‘love, knowledge, teaching’: В духмяні ложа манила лукава гречка. Медом кохання гусли очі, як соти… ‘Sly buckwheat beckoned in the fragrant bed. The honey of love filled the eyes like honeycombs’ (I. Kalynets, Dance of Thirst, 1964); ...вдивлявся в шафи і скриньки, які ховали в собі різні папки, картотеки, картки – щільники, повні гіркого меду знання, зібраного з тисяч людських уст, або як я ще називав їх у хвилини гіркоти й досади, катакомби людського життя – пояснював, свідчив, ставив підписи… ‘Looked at the closets and boxes with various folders, files, cards – honeycombs, full of bitter honey of knowledge gathered from thousands of human lips, or as I called them in moments of bitterness and annoyance, the catacombs of human life – explained, testified, signed’ (H. Auderska, The fruit of the pomegranate tree, translated by D. Andrukhiv, 1978); Ця війна, – каже Ань, – зібрала гіркий мед досвіду тисячолітньої боротьби ‘This war, says An, gathered the bitter honey of the experience of the millennial struggle’ (B. Dymytrova, translated by M. Syngaivskyi, Underground Sky.
Vietnamese Diary – 72, 1973); Той, хто не куштує меду життя з глека смутку, не розуміє, що таке життя… ‘He who does not taste the honey of life from the pitcher of sorrow does not understand what life is like…’ (S. Tkachuk, Kaleidoscope, 1985).

In our opinion, to optimize the process, any automated identification of metaphors should begin with building a database of stabilized metaphors, because in many cases considering only the minimal metaphorical context will not give the expected results. Such stabilized metaphors include phraseological units, which are vividly illustrated by collocations with the component мед ‘honey’ (although in modern texts, precisely because they are so stable, they are often used in a transformed form): Не варто псувати бочку меду ложкою дьогтю, – сказав Дінні ‘You should not spoil a barrel of honey with a spoon of tar, – said Dinny…’ (K. S. Pritchard, The Roaring Nineties, translated by L. Solonko, 1985); Багатьом здається, що при владі дають мед ложкою їсти ‘It seems to many that the authorities get a spoonful of honey to eat’ (Internet newspaper; Vysokyi Zamok, 2006).

Thus, the study revealed conceptual domains that are blended during metaphorization. We calculated the semantic distance between the words that verbalize the concepts in question (using [29]). “In distributional semantics, words are usually represented as vectors in the multidimensional space of their contexts. Semantic similarity is calculated as the cosine proximity between vectors of two words and can have values within [-1 ... 1] (in practice, only values above 0 are often used). A value of 0 means that these words do not have similar contexts and their meanings are not related to each other. The value 1 indicates full identity of contexts and, consequently, the proximity of meaning” [29].
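The cosine measure described in the quotation can be illustrated with a minimal sketch. The function name and the toy vectors are hypothetical and serve only to show the arithmetic; the service in [29] works with vectors learned from corpus contexts, which are not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length context vectors.

    Hypothetical helper for illustration; the result lies in [-1, 1].
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector carries no context information,
        # so it is treated as unrelated to everything.
        return 0.0
    return dot / (norm_u * norm_v)


# Toy vectors (assumed, not taken from any corpus).
shared_contexts = cosine_similarity([1.0, 2.0], [2.0, 4.0])
no_shared_contexts = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors that point in the same direction give a value of 1 (identical context distribution), while orthogonal vectors give 0 (no shared contexts), which corresponds to the interpretation given in [29].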
Table 5
Semantic distance between words

| Word | Бальзам ‘balm’ | Мед ‘honey’ | Отрута ‘poison’ |
|---|---|---|---|
| нектар ‘nectar’ | 0.45107579594507313 | 0.6208797585487278 | 0.4348118568001384 |
| мед ‘honey’ | 0.4771952087811581 | – | 0.4683109649419713 |
| бальзам ‘balm’ | – | 0.4771952087811581 | 0.5028191848746367 |

The indices of semantic similarity of the substance nouns fall within the range 0.43-0.62. The indices for the words included in the minimal metaphorical contexts are generally lower (0.002-0.33), while those for non-metaphorical collocations range from 0.097 to 0.53 (see Table 6). Vector analysis thus provides valuable information about interconceptual distance, indicating a tendency to metaphorization for words that are at a ‘greater distance’; because the two ranges overlap, however, it can be used only as a supplementary parameter for identifying metaphors.

Table 6
Indices of semantic similarity of the substance nouns

| Metaphorized collocate | Отрута ‘poison’ | Metaphorized collocate | Отрута ‘poison’ |
|---|---|---|---|
| злість ‘anger’ | 0.3308868957997684 | кохання ‘love’ | 0.12891872358970644 |
| злоба ‘evil’ | 0.3011577365356576 | ворожнеча ‘hostility’ | 0.11998252575390438 |
| заздрощі ‘envy’ | 0.3011577365356576 | сумнів ‘doubt’ | 0.10977057000931573 |
| брехня ‘lie’ | 0.2681072695933819 | апатія ‘apathy’ | 0.09872702565220001 |
| ревнощі ‘jealousy’ | 0.2618815477411945 | передчуття ‘anticipation’ | 0.09519532895984645 |
| пристрасть ‘passion’ | 0.2576043337562662 | любов ‘love’ | 0.0935134304568005 |
| жах ‘horror’ | 0.2566227403508865 | націоналізм ‘nationalism’ | 0.09182481239332142 |
| ненависть ‘hatred’ | 0.25424900903838155 | спогад ‘memory’ | 0.09175144267229594 |
| заздрість ‘envy’ | 0.2408787394098845 | зрада ‘betrayal’ | 0.0858440026561671 |
| азарт ‘hazard’ | 0.2342687041056827 | життя ‘life’ | 0.08254863717627696 |
| смерть ‘death’ | 0.20957808950454004 | зневіра ‘despondency’ | 0.08100316401306822 |
| образа ‘insult’ | 0.20518826793416997 | війна ‘war’ | 0.07417490885008761 |
| шовінізм ‘chauvinism’ | 0.1529430182452781 | думка ‘thought’ | 0.03801466047117183 |
| правда ‘truth’ | 0.1474135213593364 | зневір'я ‘despair’ | 0.010143498359646768 |
| інсинуація ‘innuendo’ | 0.14187039801692342 | ідеалізм ‘idealism’ | 0.001699740009861427 |

| Non-metaphorized collocate | Отрута ‘poison’ | Non-metaphorized collocate | Отрута ‘poison’ |
|---|---|---|---|
| кураре ‘curare’ | 0.5328772920494754 | стріла ‘arrow’ | 0.3660649852595244 |
| гадюка ‘viper’ | 0.480684355543558 | рослина ‘plant’ | 0.3549136298862671 |
| павук ‘spider’ | 0.4800154419184869 | цикута ‘hemlock’ | 0.3339055131871234 |
| комаха ‘insect’ | 0.47705181746284386 | ікло ‘fang’ | 0.32989676484625086 |
| кров ‘blood’ | 0.47298034138952566 | пацієнт ‘patient’ | 0.315834755363813 |
| мед ‘honey’ | 0.4683109649419713 | скорпіон ‘scorpion’ | 0.3136195716421568 |
| організм ‘organism’ | 0.468012262219308 | людина ‘man’ | 0.30783231082861556 |
| гриб ‘mushroom’ | 0.4275471340564579 | кобра ‘cobra’ | 0.30650560129434 |
| каракурт ‘karakurt’ | 0.426857503815521 | колючка ‘thorn’ | 0.2696868590240854 |
| змія ‘snake’ | 0.42227195451628796 | обличчя ‘face’ | 0.17895976611022407 |
| абсент ‘absinthe’ | 0.42054481184142783 | рятунок ‘rescue’ | 0.17026515358561461 |
| бджола ‘bee’ | 0.41244825131368734 | лікар ‘doctor’ | 0.15915295896413628 |
| тварина ‘animal’ | 0.40910802571132343 | лице ‘face’ | 0.15803219227075133 |
| гюрза ‘gurza’ | 0.3720726981888439 | анчар ‘antiar’ | 0.09715284384975119 |

4. Conclusions

Thus, distributional semantic analysis of collocations of the formal model NOUN (t:stuff r:concr) + NOUN yields accurate results in 90–94.89% of cases. The findings show that the collocations of the model NOUN (t:stuff r:concr) + NOUN (r:abstr; t:psych, t:speech, t:word, t:text, t:physiol, t:ment, t:manif:emot, t:be:exist, etc.) are metaphorical. Identification procedures require additional clarification for the collocations of the model NOUN (t:stuff r:concr) + NOUN (t:space, t:hum, t:group, t:topon, etc.) in fiction texts. The interpretation of collocations that include somatic symbols is problematic under the classification of nouns into semantic categories/thematic classes used in this study.
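The two formal models summarized above can be read as a simple tag-based filter. The following is a minimal sketch under stated assumptions, not the procedure used in the study: the function name, the label strings and the rule that "needs review" tags take precedence are all hypothetical, and only the tag names themselves are taken from the text.

```python
# Tags of the right-hand noun that the conclusions list as metaphorical.
METAPHOR_TAGS = {
    "r:abstr", "t:psych", "t:speech", "t:word", "t:text",
    "t:physiol", "t:ment", "t:manif:emot", "t:be:exist",
}
# Tags for which the conclusions say additional clarification is required.
REVIEW_TAGS = {"t:space", "t:hum", "t:group", "t:topon"}
# Tags that define the left-hand substance noun of the model.
SUBSTANCE_TAGS = {"t:stuff", "r:concr"}


def classify_collocation(head_tags, collocate_tags):
    """Label a NOUN + NOUN pair by the semantic tags of its components.

    Hypothetical helper: both arguments are iterables of tag strings.
    """
    head = set(head_tags)
    collocate = set(collocate_tags)
    if not SUBSTANCE_TAGS <= head:
        return "not_applicable"      # the model requires a substance noun
    if collocate & REVIEW_TAGS:
        return "needs_review"        # assumed precedence: be conservative
    if collocate & METAPHOR_TAGS:
        return "metaphor_candidate"
    return "unknown"


label_abstract = classify_collocation(["t:stuff", "r:concr"], ["r:abstr", "t:psych"])
label_space = classify_collocation(["t:stuff", "r:concr"], ["sc:constr:build", "t:space", "r:concr"])
```

In this sketch a collocate tagged as abstract or psychological is labeled a metaphor candidate, while a collocate with a spatial tag is routed to manual or dictionary-based verification, reflecting the observation that such cases cannot be decided by tags alone. Text genre is deliberately not modeled here.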
Furthermore, the model NOUN (t:stuff r:concr) + NOUN (t:disease, t:size, t:param, t:physiol, t:psych, t:perc) proved to be non-metaphorical in scientific and popular science texts.

Semantically annotated corpora provide a basis for creating techniques for automatic/automated metaphor identification. Specific algorithms for metaphor identification depend on the principles of semantic tagging used in a particular corpus of texts. However, the starting point of identification is the idea of a ‘metaphorical hierarchy’: the concepts that are at higher levels are metaphorized in terms of the concepts of lower levels; the greater the interconceptual distance, the more likely the creation of metaphorical meaning is. The distribution of conceptual verbalizers at different levels (by a certain semantic category/thematic class) allows us to describe formal models of potential metaphors.

5. References

[1] T. Strzalkowski, G. A. Broadwell, S. Taylor, L. Feldman, B. Yamrom, S. Shaikh, T. Liu, K. Cho, U. Boz, I. Cases, K. Elliott, Robust Extraction of Metaphors from Novel Data, in: Proceedings of the First Workshop on Metaphor in NLP, The American Association for Computational Linguistics (NAACL-2013), Atlanta, USA, 2013, pp. 67-76.
[2] T. Veale, E. Shutova, B. Klebanov, Metaphor: a computational perspective, Synthesis Lectures on Human Language Technologies, Morgan Claypool, San Rafael, California, 2016.
[3] Yu. Badryzlova, Experience of corpus modeling of factors of metaphoricity based on Russian verbs, Computational linguistics and intellectual technologies: on the materials of the international conference "Dialogue 2017", Moscow, May 31 – June 3, 2017. URL: http://www.dialog-21.ru/media/3898/badryzlovayug.pdf
[4] O. Levchenko, N. Romanyshyn, Modern approaches to automated identification of metaphor, Bulletin of Lviv University, Philological series 70 (2019) 288–298.
[5] E. Shutova, L. Sun, A. Korhonen, Metaphor identification using verb and noun clustering, in: Proceedings of the 23rd International Conference on Computational Linguistics, Beijing, 2010, pp. 1002–1010.
[6] P. D. Turney, Y. Neuman, D. Assaf, Y. Cohen, Literal and metaphorical sense identification through concrete and abstract context, in: The 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, 2011, pp. 680–690.
[7] M. Coltheart, The MRC psycholinguistic database, The Quarterly Journal of Experimental Psychology Section A 33(4) (1981) 497–505.
[8] M. Wilson, MRC psycholinguistic database: Machine-usable dictionary, version 2.00, Behavior Research Methods, Instruments, & Computers 20(1) (1988) 6–10.
[9] G. J. Steen, A. G. Dorst, J. B. Herrmann, A. Kaal, T. Krennmayr, VU Amsterdam Metaphor Corpus, 2010. URL: http://ota.ahds.ac.uk/headers/2541
[10] O. Levchenko, N. Romanyshyn, D. Dosyn, Method of automated identification of metaphoric meaning in adjective + noun word combinations (based on the Ukrainian language), in: CEUR Workshop Proceedings, Vol. 2386, Workshop proceedings of the 8th International conference on Mathematics. Information technologies. Education, MoMLeT&DS 2019, pp. 370–380.
[11] O. Y. Petruk, O. P. Levchenko, Identification of the metaphor model білий ‘white’ + noun by the method of quantitative analysis of dictionary definition, Young scientist 10 (74) (2019) 505-511.
[12] Ya. Smuzhanytsia, Automated identification of the metaphor model гіркий/солодкий/прісний ‘bitter/sweet/fresh’ + noun, Master's thesis, 2020.
[13] Yu. Kyrylyuk, Algorithm of automatic identification of zoomorphic metaphor, Master's thesis, Lviv, 2020.
[14] V. Starko, Semantic Annotation for Ukrainian: Categorization Scheme, Principles, and Tools, COLINS, 2020, pp. 239-248.
[15] M. Shvedova, R. von Waldenfels, S. Yarygin, A. Rysin, V. Starko, M. Wozniak, M. Kruk, General regionally annotated corpus of the Ukrainian language (GRAC), Kyiv, Lviv, Jena, 2017-2021.
[16] National Corpus of the Russian Language, 2003–2006. URL: https://ruscorpora.ru
[17] G. Sklyarevskaya, Metaphor in the language system, Ros. AN, Institute of Linguistic research, Science, St. Petersburg, 1993.
[18] N. D. Arutyunova, Language and the human world, Languages of Russian culture, Moscow, 1999.
[19] O. P. Levchenko, Symbols in phraseological systems of the Ukrainian and Russian languages: a linguo-cultural aspect, Manuscript, The dissertation on competition of a scientific degree of the doctor of philological sciences in specialty 10.02.01 the Ukrainian language, 10.02.02 the Russian language, Institute of Linguistics, O.O. Potebnya NAS of Ukraine, Kyiv, 2007.
[20] V. M. Bilonozhenko, L. S. Palamarchuk et al. (Eds.), Phraseological dictionary of the Ukrainian language, Naukova dumka, Kyiv, 1993.
[21] A. Koshelev, M. Leonidova (Eds.), Bulgarian-Russian phraseological dictionary, Science and Art, Moscow, Sofia, 1974.
[22] SJP – Słownik języka polskiego, Wydawnictwo Naukowe PWN SA, 2003.
[23] Bulgarian Explanatory Dictionary, Eurodictxp. URL: http://koralsoft.dir.bg/dict.php
[24] J. Dunn, How linguistic structure influences and helps to predict metaphoric meaning, Cognitive Linguistics 24(1) (2013) 33-66.
[25] N. Fairclough, Critical Discourse Analysis: the Critical Study of Language, Longman, London/New York, 1995.
[26] J. Charteris-Black, Corpus Approaches to Critical Metaphor Analysis, Palgrave Macmillan UK, 2004.
[27] O. Levchenko, Phraseological symbolism: linguo-cultural aspect, Lviv, 2005.
[28] Dictionary of the Ukrainian Language, vol. 8, p. 416. URL: http://sum.in.ua/p/8/416/1
[29] Comprehensive information system of scientific research "Automated workplace of a researcher". URL: http://icybcluster.org.ua:34145/