=Paper= {{Paper |id=Vol-2604/paper8 |storemode=property |title=Quantitative Parameters of Some Novellas by Roman Ivanychuk |pdfUrl=https://ceur-ws.org/Vol-2604/paper8.pdf |volume=Vol-2604 |authors=Ihor Kulchytskyy |dblpUrl=https://dblp.org/rec/conf/colins/Kulchytskyy20 }} ==Quantitative Parameters of Some Novellas by Roman Ivanychuk== https://ceur-ws.org/Vol-2604/paper8.pdf
    Quantitative Parameters of Some Novellas by Roman
                       Ivanychuk

                                      Ihor Kulchytskyy
                    Lviv Polytechnic National University, 12 Bandera street,
                                     Lviv, Ukraine, 79013
                                     bis.kim@gmail.com



        Abstract. Modern linguistics offers many approaches and methods, and there
        is an increasing tendency towards the use of quantitative methods in research.
        It is believed that at the intersection of two fields, namely linguistics and
        statistics, scholars can obtain the most accurate and up-to-date results. This
        paper presents a statistical analysis of novellas written by the renowned
        Ukrainian writer Roman Ivanychuk. The statistical analysis of the texts
        provides an in-depth perspective on the author's specific style of writing.

        Keywords: statistical analysis, quantitative parameters, novellas, idiolect,
        corpus linguistics


1       Introduction

At the current stage of the development of linguistics, electronic text corpora have
become an integral part of many studies devoted to the individual style of an author.
Corpus linguistics is a methodology of linguistics that consists of computer-based
empirical analysis (both quantitative and qualitative) of actual patterns of language
usage, using large-scale collections of naturally occurring spoken and written texts
available in electronic form, called corpora. An electronic corpus of texts is a
useful tool for language learning, text attribution and historical research of
linguistic phenomena. This paper focuses on the individual writing style of Roman
Ivanychuk, studied by statistical means in order to find its distinctive features. It
is believed that he had a distinctive manner of writing and, compared with other
Ukrainian authors, a preference for extremely long sentences. The results of the
research will be useful for text attribution, language learning and historical
research of the Ukrainian language.


2       The Interrelation of Corpus Linguistics, Statistics and
        Idiolect

From the historical standpoint, quantitative criteria have long been among the most
relevant applied methods of linguistic research. In the twentieth century, Ferdinand
de Saussure was among those who laid the foundations of such research methods
[3, p. 123]. Later on, the development of machine translation significantly sped up
the use of mathematical methods in linguistics.
    Copyright © 2020 for this paper by its authors.
    Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
   When the texts were processed for input into the computer, various quantitative
estimates of particular features of language were obtained. They proved to be useful
not only for the creation of mathematical language models, but also for linguistic
theory in general. Since language is a probabilistic rather than a well-defined
system, quantitative methods are needed to identify its specific features, in
particular its probabilistic, gradual, frequency-related and other non-logical
features [11, p. 139].
   Statistics is a mathematical science whose purpose is to collect, analyze, explain,
present and interpret data. Statistical methods are also broadly used in corpus
linguistics, where they have become one of the most efficient and time-saving tools
for processing different sets of texts.
   Since corpus linguistics is based on conducting linguistic analyses, it can be used
to explore many types of language issues, and it has the potential to generate
interesting, fundamental, and often unexpected new perspectives on language. That is
why corpus linguistics has become one of the most widely used methods of linguistic
research in recent years.
   A text corpus can be defined as a systematic set of natural texts (both written and
spoken). Systematicity means that the structure and content of the corpus comply with
certain extra-linguistic principles (e.g. the sampling principles on the basis of
which the included texts were selected).



3      Material: Collection, Organization and Methods of Research


The material for the research comprises the following novellas by Roman Ivanychuk:
“I zemlia, I zelo, I pisnia” (“And earth, and green, and song”) (hereafter referred to
as RI1) [4], “Lisova povist” (“Forest story”) (RI2) [5], “Nespokutne”
(“No Atonement”) (RI3) [6], “Solo na fleiti” (“Flute Solo”) (RI4) [7]. To comply with
the general requirements for the publication, the novella titles are also given in the
author’s translation into English.
   First of all, the texts of the given novellas were converted into electronic form
with the help of the ABBYY FineReader software and saved in .docx format. The next
step was the normalization of the texts in the MS Word editor. Normalization meant
bringing the text into full compliance with the original, aligning the spelling and
punctuation of the text with the spelling standards [15], marking all foreign words
with the relevant languages, etc.
   The normalized texts were then formalized (automatically lemmatized) with the help
of R2U software, access to which was granted by Vasily Starko [14].
   The results of the automatic lemmatization were converted to the required format
using in-house Python applications and were validated and corrected with MS Access.
   The next step was to structure the text using XML-style tags [10]. The following
structural elements were distinguished:

• paragraph;
• sentence;
• character language;
• epigraph;
• the text of the epigraph;
• source of the epigraph;
• the beginning of the original page with its number;
• place and date of writing.

   The normalization, text recognition and verification of the automatic lemmatization were carried out as part of the master's thesis of Victoria Ogorodnik, a graduate student of the Department of Applied Linguistics of Lviv Polytechnic National University [12].
   The resulting texts and the results of the lemmatization were subjected to statistical analysis. The statistics were calculated using standard methods and formulas of mathematical statistics [Beginning Statistics]. The software required for the analysis was written in Python.
   For the general statistical study of the above-mentioned novellas, the following coefficients were calculated [2; 8; 13]:
   Vocabulary richness, also called the diversity coefficient. The greater the value of this indicator, the more different words can be found in a particular text. It is calculated as the ratio of the number of words in the text to the number of word usages.
   Average word repetition in text. It shows how many times, on average, each word is used in the text. It is calculated as the ratio of the number of word usages to the number of words.
   Exclusivity ratio. This indicator characterizes the variability of vocabulary. It is calculated separately for the text (the ratio of the number of word forms that occur in the text once to the total number of word usages) and for the vocabulary (the ratio of the number of words that occur in the text once to the total number of words).
   Vocabulary concentration coefficient. This indicator is the opposite of the exclusivity ratio. For the text, it is calculated as the ratio of the number of word forms that occur in the text 10 or more times to the total number of word usages.
Accordingly, for the vocabulary, it is calculated as the ratio of the number of words that occur in the text 10 or more times to the total number of words. A relatively small share of high-frequency vocabulary (low concentration coefficient) and a relatively large share of words with frequency 1 (high exclusivity ratio) tend to indicate a considerable variety of vocabulary.
   Automated readability index (ARI) is a measure of the readability of a text based on the average number of characters per word and the average number of words per sentence. It is calculated according to the formula

$$\mathrm{ARI} = 4{,}71 \cdot \frac{C}{W} + 0{,}5 \cdot \frac{W}{S} - 21{,}43,$$

where C stands for the number of characters, W for the number of words and S for the number of sentences.
   Coefficient of lexical density is calculated as the ratio of the number of word forms of independent parts of speech in the text to the total number of word forms.
   Adjectives to nouns ratio is also called the coefficient of epithetization. It is calculated as the ratio of the number of uses of adjectives in the text to the number of uses of nouns.
   Adverb to verb ratio is the ratio of the number of uses of adverbs to the number of uses of verbs.
   Nouns to verbs ratio is calculated as the ratio of the number of uses of nouns to the number of uses of verbs.
   Verbs to total number of words ratio is also known as the aggressiveness ratio and is calculated as the ratio of the number of uses of verbs to the total number of words in the text.
   Coefficient of logical connectivity (conjunctions and prepositions to total number of sentences ratio) is calculated as the ratio of the number of uses of conjunctions and prepositions to the total number of sentences in the text.
   Coefficient of speech “embolism” (clogging), or exclamations and particles to total number of words ratio, is calculated as the ratio of the number of uses of exclamations and particles to the total number of words used.
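The vocabulary coefficients and the readability index defined above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the software used in the study: the class and function names are hypothetical, the input values are the RI1 counts reported in Table 1, the denominators of the word-form coefficients are taken to be the number of word usages (the choice that reproduces the values reported in Table 2), and the ARI is computed with the average sentence length W / S.

```python
from dataclasses import dataclass


@dataclass
class NovellaCounts:
    """Raw counts for one novella (field names are illustrative)."""
    word_usages: int          # running words in the text
    words: int                # distinct words (lemmas)
    hapax_word_forms: int     # word forms that occur exactly once
    hapax_words: int          # words that occur exactly once
    frequent_word_forms: int  # word forms that occur 10 or more times
    frequent_words: int       # words that occur 10 or more times
    letters: int
    sentences: int


def vocabulary_coefficients(c: NovellaCounts) -> dict[str, float]:
    """Compute the vocabulary coefficients from raw counts."""
    return {
        "vocabulary_richness": c.words / c.word_usages,
        "average_repetition": c.word_usages / c.words,
        # Assumption: word-form coefficients are normalized by word usages.
        "exclusivity_word_forms": c.hapax_word_forms / c.word_usages,
        "exclusivity_words": c.hapax_words / c.words,
        "concentration_word_forms": c.frequent_word_forms / c.word_usages,
        "concentration_words": c.frequent_words / c.words,
    }


def automated_readability_index(letters: int, word_usages: int, sentences: int) -> float:
    """ARI = 4.71 * C / W + 0.5 * W / S - 21.43."""
    return 4.71 * letters / word_usages + 0.5 * word_usages / sentences - 21.43


# RI1 counts as listed in Table 1.
ri1 = NovellaCounts(
    word_usages=8775, words=2614,
    hapax_word_forms=2915, hapax_words=1636,
    frequent_word_forms=101, frequent_words=127,
    letters=43873, sentences=398,
)
coefficients = vocabulary_coefficients(ri1)
ari = automated_readability_index(ri1.letters, ri1.word_usages, ri1.sentences)

for name, value in coefficients.items():
    print(f"{name}: {value:.2f}")
print(f"ARI: {ari:.2f}")
```

Rounded to two decimal places, these values agree with the RI1 column of Table 2, which is a simple way to check that the definitions are being read as intended.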
   The adjectives to nouns ratio, adverb to verb ratio, nouns to verbs ratio, and verbs to total number of words ratio generally define and partially describe the style of a novella. If the nouns to verbs ratio is greater than 1, one can assume that the text is narrative (or is written in the nominal style). The adjectives to nouns ratio (the number of adjectives per noun) in the nominal style indicates the degree to which the text is fiction-like (as far as the text can be considered fiction). This is because adjectives, through their relations with nouns, are the main means of expressing figures of speech such as epithets and comparisons. The verbs to total number of words ratio (also known as the aggressiveness ratio) is the ratio of the number of verbs and verb forms (participles and gerunds) to the total number of words. High aggressiveness indicates high emotional intensity of the text, dynamic events and an intense emotional state of the author when writing the text. A logical connectivity coefficient within 1 provides a sufficiently harmonious link between auxiliary parts of speech and syntactic constructions. With a nouns to verbs ratio of less than 1 and a high verbs ratio, one can speak of a verbal idiostyle of the work, and the adverb to verb ratio (the number of adverbs per verb) indicates the level and number of figures of speech used.


4       The Discussion of the Results of the Statistical Analysis of Novellas by Roman Ivanychuk

The researched novellas have the following general statistical indicators (Table 1).

Table 1. Statistical indicators used in the research

Statistical indicator                           RI1     RI2     RI3     RI4
Number of word usages                          8775    7523    5098    4376
Number of word forms                           3938    3472    2520    2178
Number of words                                2614    2444    1825    1648
Hapax legomena for word forms                  2915    2570    1940    1660
Number of word forms used 10 times or more      101      76      50      46
Hapax legomena for words                       1636    1542    1213    1127
Number of words used 10 times or more           127     109      74      63
Number of letters in the text                 43873   40222   25917   22819
Number of sentences in the text                 398     165     168     105

The distribution of words and word usages by part of speech is presented below.
“And earth, and green, and song” novella:
Words: noun — 974 (37,26%); verb — 759 (29,04%); adjective — 458 (17,52%); adverb — 173 (6,62%); pronoun — 70 (2,68%); gerund — 50 (1,91%); preposition — 45 (1,72%); conjunction — 39 (1,49%); particle — 26 (0,99%); numeral — 14 (0,54%); exclamation — 5 (0,19%); present participle — 1 (0,04%).
Word usages: noun — 2697 (30,74%); verb — 1478 (16,84%); adjective — 833 (9,49%); adverb — 362 (4,13%); pronoun — 976 (11,12%); gerund — 56 (0,64%); preposition — 956 (10,89%); conjunction — 937 (10,68%); particle — 435 (4,96%); numeral — 30 (0,34%); exclamation — 14 (0,16%); present participle — 1 (0,01%).
“Forest story” novella:
Words: noun — 747 (30,56%); verb — 696 (28,48%); adjective — 462 (18,90%); adverb — 240 (9,82%); gerund — 109 (4,46%); pronoun — 79 (3,23%); preposition — 42 (1,72%); conjunction — 34 (1,39%); particle — 30 (1,23%); numeral — 4 (0,16%); present participle — 1 (0,04%).
Word usages: noun — 2173 (28,88%); verb — 1199 (15,94%); adjective — 855 (11,37%); adverb — 464 (6,17%); gerund — 126 (1,67%); pronoun — 804 (10,69%); preposition — 852 (11,33%); conjunction — 751 (9,98%); particle — 281 (3,74%); numeral — 17 (0,23%); present participle — 1 (0,01%).
“No Atonement” novella:
Words: noun — 620 (33,97%); verb — 531 (29,10%); adjective — 299 (16,38%); adverb — 138 (7,56%); pronoun — 74 (4,05%); gerund — 58 (3,18%); preposition — 37 (2,03%); conjunction — 31 (1,70%); particle — 31 (1,70%); numeral — 4 (0,22%); exclamation — 2 (0,11%).
Word usages: noun — 1329 (26,07%); verb — 852 (16,71%); adjective — 456 (8,94%); adverb — 226 (4,43%); pronoun — 763 (14,97%); gerund — 64 (1,26%); preposition — 637 (12,50%); conjunction — 538 (10,55%); particle — 217 (4,26%); numeral — 14 (0,27%); exclamation — 2 (0,04%).
“Flute Solo” novella:
Words: noun — 620 (37,62%); verb — 407 (24,70%); adjective — 289 (17,54%); adverb — 134 (8,13%); pronoun — 68 (4,13%); preposition — 38 (2,31%); conjunction — 31 (1,88%); present participle — 30 (1,82%); particle — 23 (1,40%); numeral — 7 (0,42%); exclamation — 1 (0,06%).
Word usages: noun — 1188 (27,15%); verb — 695 (15,88%); adjective — 432 (9,87%); adverb — 217 (4,96%); pronoun — 665 (15,20%); preposition — 515 (11,77%); conjunction — 453 (10,35%); gerund — 31 (0,71%); particle — 163 (3,72%); numeral — 16 (0,37%); exclamation — 1 (0,02%).

The values of the statistical coefficients that characterize the researched novellas are presented in Tables 2 and 3 below.

Table 2. Total coefficients of words

Coefficient                                             RI1     RI2     RI3     RI4
Vocabulary richness                                    0,30    0,32    0,36    0,38
Average word repetition in text                        3,36    3,08    2,79    2,66
Exclusivity ratio for word forms                       0,33    0,34    0,38    0,38
Exclusivity ratio for words                            0,63    0,63    0,66    0,68
Vocabulary concentration coefficient for word forms    0,01    0,01    0,01    0,01
Vocabulary concentration coefficient for words         0,05    0,04    0,04    0,04
Automated readability index                           13,14   26,55   17,69   23,97

Table 3. General text coefficients

Coefficient                                              RI1    RI2    RI3    RI4
Coefficient of lexical density                          0,22   0,21   0,23   0,22
Adjectives to nouns ratio                               3,24   2,54   2,91   2,75
Adverb to verb ratio                                    0,24   0,35   0,25   0,30
Nouns to verbs ratio                                    1,76   1,64   1,45   1,64
Verbs to total number of words ratio (aggressiveness)   0,17   0,18   0,18   0,17
Coefficient of logical connectivity                     4,76   9,72   6,99   9,22
Coefficient of speech “embolism”                        0,05   0,04   0,04   0,04

It is important to mention that the percentage of each part of speech differs between word usages and words. The results are shown in Figure 1 below:

Fig. 1. The percentage difference of parts of speech in word usages and words in the text.

It should be noted that, since modern grammatical theories consider the gerund and the present participle to be verb classes, these two parts of speech were merged with verbs [1].
   As can be seen, for parts of speech such as verb, noun, adjective and adverb, the share in word usages is lower than the share in words (on average by a factor of 0.6 for verbs, 0.8 for nouns, 0.6 for adjectives and 0.6 for adverbs). It is significantly higher for pronouns (3.7), prepositions (6.0), conjunctions (6.5) and particles (3.3). The share of numerals did not change at all (1.0), while the share of exclamations decreased (0.4). The reason probably lies in the way sentences are constructed.
   For the further parts of speech analysis of the texts, prepositions, conjunctions and particles were grouped into the “auxiliary parts of speech” group, while exclamations and numerals were grouped into the “miscellaneous” group, since their samples are not large enough for the general statistical analysis described in the paper.
   The results were compared with the quantitative parts of speech distribution of the 11-volume Dictionary of the Ukrainian language:

Fig. 2.
The parts of speech distribution of Roman Ivanychuk’s novellas compared with the 11-volume Dictionary of the Ukrainian language

Figure 3 below represents the parts of speech distribution for words encountered in the researched novellas. Figure 4 below represents the parts of speech distribution for word usages encountered in the researched novellas.

Fig. 3. Parts of speech distribution for words

Fig. 4. Parts of speech distribution for word usages

The distribution of rank frequencies is shown in Figure 5. It focuses on word forms, although it is important to mention that the distribution of rank frequencies for words is the same as for word forms.

Fig. 5. The distribution of rank frequencies for word forms in the novellas by R. Ivanychuk

The frequency distributions for each of the novellas are as follows:
• “And earth, and green, and song” novella
Words: 1 — 1636 (62,59%); 2 — 402 (15,38%); 3 — 186 (7,12%); 4 — 103 (3,94%); 5 — 67 (2,56%); 6 — 33 (1,26%); 9 — 22 (0,84%); 7 — 19 (0,73%); 8 — 19 (0,73%); 10 — 16 (0,61%); 12 — 10 (0,38%); 11 — 9 (0,34%); 13 — 7 (0,27%); 16 — 6 (0,23%); 20 — 5 (0,19%); 14 — 4 (0,15%); 15 — 4 (0,15%); 18 — 4 (0,15%); 21 — 4 (0,15%); 24 — 4 (0,15%); 28 — 4 (0,15%); 17 — 3 (0,11%); 19 — 2 (0,08%); 26 — 2 (0,08%); 27 — 2 (0,08%); 31 — 2 (0,08%); 33 — 2 (0,08%); 34 — 2 (0,08%); 38 — 2 (0,08%); 39 — 2 (0,08%); 44 — 2 (0,08%); 67 — 2 (0,08%); 22 — 1 (0,04%); 29 — 1 (0,04%); 36 — 1 (0,04%); 37 — 1 (0,04%); 42 — 1 (0,04%); 45 — 1 (0,04%); 48 — 1 (0,04%); 52 — 1 (0,04%); 55 — 1 (0,04%); 58 — 1 (0,04%); 68 — 1 (0,04%); 69 — 1 (0,04%); 73 — 1 (0,04%); 78 — 1 (0,04%); 80 — 1 (0,04%); 85 — 1 (0,04%); 86 — 1 (0,04%); 92 — 1 (0,04%); 100 — 1 (0,04%); 121 — 1 (0,04%); 123 — 1 (0,04%); 126 — 1 (0,04%); 139 — 1 (0,04%); 142 — 1 (0,04%); 159 — 1 (0,04%); 217 — 1 (0,04%); 254 — 1 (0,04%).
Word forms: 1 — 2915 (74,02%); 2 — 501 (12,72%); 3 — 201 (5,10%); 4 — 95 (2,41%); 5 — 49 (1,24%); 6 — 27 (0,69%); 7 — 17 (0,43%); 9 — 17 (0,43%); 8 — 15 (0,38%); 11 — 15 (0,38%); 10 — 13 (0,33%); 12 — 7 (0,18%); 13 — 7 (0,18%); 18 — 6 (0,15%); 14 — 5 (0,13%); 21 — 5 (0,13%); 15 — 4 (0,10%); 26 — 3 (0,08%); 27 — 3 (0,08%); 19 — 2 (0,05%); 20 — 2 (0,05%); 23 — 2 (0,05%); 17 — 1 (0,03%); 28 — 1 (0,03%); 30 — 1 (0,03%); 31 — 1 (0,03%); 32 — 1 (0,03%); 33 — 1 (0,03%); 34 — 1 (0,03%); 37 — 1 (0,03%); 40 — 1 (0,03%); 42 — 1 (0,03%); 44 — 1 (0,03%); 45 — 1 (0,03%); 50 — 1 (0,03%); 51 — 1 (0,03%); 53 — 1 (0,03%); 56 — 1 (0,03%); 59 — 1 (0,03%); 77 — 1 (0,03%); 82 — 1 (0,03%); 83 — 1 (0,03%); 86 — 1 (0,03%); 120 — 1 (0,03%); 126 — 1 (0,03%); 142 — 1 (0,03%); 156 — 1 (0,03%); 191 — 1 (0,03%); 235 — 1 (0,03%). • “Forest story” novella Words: 1 — 1542 (63,09%); 2 — 383 (15,67%); 3 — 169 (6,91%); 4 — 99 (4,05%); 5 — 50 (2,05%); 6 — 36 (1,47%); 7 — 28 (1,15%); 8 — 18 (0,74%); 10 — 12 (0,49%); 11 — 11 (0,45%); 9 — 10 (0,41%); 12 — 10 (0,41%); 14 — 10 (0,41%); 17 — 10 (0,41%); 13 — 9 (0,37%); 15 — 3 (0,12%); 19 — 2 (0,08%); 21 — 2 (0,08%); 24 — 2 (0,08%); 26 — 2 (0,08%); 39 — 2 (0,08%); 41 — 2 (0,08%); 50 — 2 (0,08%); 16 — 1 (0,04%); 18 — 1 (0,04%); 22 — 1 (0,04%); 23 — 1 (0,04%); 25 — 1 (0,04%); 27 — 1 (0,04%); 28 — 1 (0,04%); 29 — 1 (0,04%); 30 — 1 (0,04%); 31 — 1 (0,04%); 32 — 1 (0,04%); 33 — 1 (0,04%); 36 — 1 (0,04%); 37 — 1 (0,04%); 47 — 1 (0,04%); 48 — 1 (0,04%); 52 — 1 (0,04%); 72 — 1 (0,04%); 78 — 1 (0,04%); 86 — 1 (0,04%); 93 — 1 (0,04%); 109 — 1 (0,04%); 119 — 1 (0,04%); 123 — 1 (0,04%); 124 — 1 (0,04%); 125 — 1 (0,04%); 142 — 1 (0,04%); 159 — 1 (0,04%); 172 — 1 (0,04%); 207 — 1 (0,04%). 
Word forms: 1 — 2570 (74,02%); 2 — 456 (13,13%); 3 — 169 (4,87%); 4 — 80 (2,30%); 5 — 45 (1,30%); 6 — 35 (1,01%); 7 — 16 (0,46%); 8 — 15 (0,43%); 9 — 10 (0,29%); 12 — 9 (0,26%); 13 — 9 (0,26%); 10 — 6 (0,17%); 11 — 6 (0,17%); 14 — 4 (0,12%); 15 — 4 (0,12%); 17 — 4 (0,12%); 16 — 3 (0,09%); 18 — 2 (0,06%); 19 — 2 (0,06%); 23 — 2 (0,06%); 27 — 2 (0,06%); 39 — 2 (0,06%); 21 — 1 (0,03%); 24 — 1 (0,03%); 25 — 1 (0,03%); 28 — 1 (0,03%); 29 — 1 (0,03%); 30 — 1 (0,03%); 32 — 1 (0,03%); 34 — 1 (0,03%); 62 — 1 (0,03%); 63 — 1 (0,03%); 64 — 1 (0,03%); 76 — 1 (0,03%); 81 — 1 (0,03%); 93 — 1 (0,03%); 105 — 1 (0,03%); 117 — 1 (0,03%); 121 — 1 (0,03%); 124 — 1 (0,03%); 134 — 1 (0,03%); 158 — 1 (0,03%); 201 — 1 (0,03%). • “No Atonement” novella Words: 1 — 1213 (66,47%); 2 — 280 (15,34%); 3 — 107 (5,86%); 4 — 53 (2,90%); 5 — 33 (1,81%); 6 — 23 (1,26%); 7 — 17 (0,93%); 8 — 14 (0,77%); 9 — 11 (0,60%); 13 — 9 (0,49%); 10 — 8 (0,44%); 11 — 7 (0,38%); 12 — 5 (0,27%); 15 — 5 (0,27%); 14 — 4 (0,22%); 22 — 4 (0,22%); 16 — 2 (0,11%); 17 — 2 (0,11%); 18 — 2 (0,11%); 20 — 2 (0,11%); 31 — 2 (0,11%); 43 — 2 (0,11%); 54 — 2 (0,11%); 71 — 2 (0,11%); 19 — 1 (0,05%); 21 — 1 (0,05%); 23 — 1 (0,05%); 30 — 1 (0,05%); 36 — 1 (0,05%); 39 — 1 (0,05%); 45 — 1 (0,05%); 46 — 1 (0,05%); 74 — 1 (0,05%); 77 — 1 (0,05%); 83 — 1 (0,05%); 93 — 1 (0,05%); 95 — 1 (0,05%); 99 — 1 (0,05%); 137 — 1 (0,05%); 149 — 1 (0,05%). 
Word forms: 1 — 1940 (76,98%); 2 — 301 (11,94%); 3 — 100 (3,97%); 4 — 42 (1,67%); 5 — 31 (1,23%); 7 — 21 (0,83%); 6 — 15 (0,60%); 8 — 14 (0,56%); 13 — 8 (0,32%); 9 — 6 (0,24%); 11 — 5 (0,20%); 12 — 4 (0,16%); 15 — 4 (0,16%); 10 — 3 (0,12%); 14 — 3 (0,12%); 70 — 2 (0,08%); 16 — 1 (0,04%); 17 — 1 (0,04%); 18 — 1 (0,04%); 19 — 1 (0,04%); 21 — 1 (0,04%); 23 — 1 (0,04%); 24 — 1 (0,04%); 29 — 1 (0,04%); 30 — 1 (0,04%); 32 — 1 (0,04%); 38 — 1 (0,04%); 40 — 1 (0,04%); 42 — 1 (0,04%); 47 — 1 (0,04%); 53 — 1 (0,04%); 71 — 1 (0,04%); 73 — 1 (0,04%); 92 — 1 (0,04%); 95 — 1 (0,04%); 135 — 1 (0,04%); 136 — 1 (0,04%).
• “Flute Solo” novella
Words: 1 — 1127 (68,39%); 2 — 230 (13,96%); 3 — 97 (5,89%); 4 — 46 (2,79%); 6 — 31 (1,88%); 5 — 24 (1,46%); 7 — 17 (1,03%); 9 — 7 (0,42%); 12 — 7 (0,42%); 8 — 6 (0,36%); 10 — 6 (0,36%); 14 — 6 (0,36%); 11 — 4 (0,24%); 13 — 3 (0,18%); 15 — 3 (0,18%); 16 — 3 (0,18%); 17 — 3 (0,18%); 18 — 2 (0,12%); 21 — 2 (0,12%); 22 — 2 (0,12%); 25 — 2 (0,12%); 44 — 2 (0,12%); 49 — 2 (0,12%); 88 — 2 (0,12%); 19 — 1 (0,06%); 20 — 1 (0,06%); 23 — 1 (0,06%); 26 — 1 (0,06%); 28 — 1 (0,06%); 43 — 1 (0,06%); 50 — 1 (0,06%); 53 — 1 (0,06%); 58 — 1 (0,06%); 62 — 1 (0,06%); 64 — 1 (0,06%); 89 — 1 (0,06%); 116 — 1 (0,06%); 138 — 1 (0,06%).
Word forms: 1 — 1660 (76,22%); 2 — 259 (11,89%); 3 — 98 (4,50%); 4 — 47 (2,16%); 5 — 28 (1,29%); 6 — 13 (0,60%); 8 — 11 (0,51%); 7 — 10 (0,46%); 10 — 8 (0,37%); 12 — 8 (0,37%); 9 — 6 (0,28%); 13 — 4 (0,18%); 11 — 3 (0,14%); 16 — 3 (0,14%); 28 — 2 (0,09%); 49 — 2 (0,09%); 51 — 2 (0,09%); 85 — 2 (0,09%); 14 — 1 (0,05%); 21 — 1 (0,05%); 22 — 1 (0,05%); 23 — 1 (0,05%); 25 — 1 (0,05%); 26 — 1 (0,05%); 36 — 1 (0,05%); 48 — 1 (0,05%); 62 — 1 (0,05%); 70 — 1 (0,05%); 88 — 1 (0,05%); 116 — 1 (0,05%).

As can be seen, words with a frequency of 1 make up 63%-68% of the vocabulary of each novella (Figure 6).
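The frequency distributions listed above are frequency spectra: for each frequency, the number of distinct items that occur that many times. The following Python sketch shows one way such a spectrum and the share of frequency-1 items could be derived from a list of lemmas or word forms. The function names and the toy token list are hypothetical and serve only as an illustration; the final check uses the RI1 figures from Table 1.

```python
from collections import Counter


def frequency_spectrum(items):
    """Map each frequency f to the number of distinct items occurring f times."""
    item_frequencies = Counter(items)          # item -> number of occurrences
    return Counter(item_frequencies.values())  # frequency -> number of items


def share_of_frequency(spectrum, frequency):
    """Percentage of distinct items that occur exactly `frequency` times."""
    total_items = sum(spectrum.values())
    return 100 * spectrum[frequency] / total_items


# Toy input: "a" occurs 3 times, "b" twice, "c", "d" and "e" once each.
tokens = ["a", "b", "a", "c", "d", "a", "b", "e"]
spectrum = frequency_spectrum(tokens)
hapax_share = share_of_frequency(spectrum, 1)

# The same share computed from the RI1 totals in Table 1:
# 1636 words with frequency 1 out of 2614 words.
ri1_hapax_share = 100 * 1636 / 2614

print(dict(spectrum))
print(f"{hapax_share:.2f}%  {ri1_hapax_share:.2f}%")
```

Applied to the lemma list of a novella, `frequency_spectrum` would yield the "Words" distribution, and applied to the list of word forms it would yield the "Word forms" distribution.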
Regarding the word forms, the share of items with a frequency of 1 is somewhat higher, 74%-77%, and word forms with frequencies from 1 to 5 together account for 95%-96% (Figure 7).

Fig. 6. Ranks (frequencies) of words for each novella

Fig. 7. Ranks (frequencies) of word forms for each novella

The results shown above suggest that the Ukrainian writer Roman Ivanychuk possessed a very rich vocabulary, which was reflected in his manner of writing. The results also made it possible to calculate the following statistical coefficients:

Table 4. Words coefficient

Coefficient                                             RI1     RI2     RI3     RI4
Vocabulary richness                                    0,30    0,32    0,36    0,38
Average word repetition in text                        3,36    3,08    2,79    2,66
Exclusivity ratio for word forms                       0,33    0,34    0,38    0,38
Exclusivity ratio for words                            0,63    0,63    0,66    0,68
Vocabulary concentration coefficient for word forms    0,01    0,01    0,01    0,01
Vocabulary concentration coefficient for words         0,05    0,04    0,04    0,04
Automated readability index                           13,14   26,55   17,69   23,97

Table 5. Text coefficient

Coefficient                                              RI1    RI2    RI3    RI4
Coefficient of lexical density                          0,22   0,21   0,23   0,22
Adjectives to nouns ratio                               0,31   0,39   0,34   0,36
Adverb to verb ratio                                    0,24   0,35   0,25   0,30
Nouns to verbs ratio                                    1,76   1,64   1,45   1,64
Verbs to total number of words ratio (aggressiveness)   0,17   0,18   0,18   0,17
Coefficient of logical connectivity                     1,59   3,24   1,34   1,89
Coefficient of speech “embolism”                        0,05   0,04   0,04   0,04

The calculations made in this research show that nouns outnumber verbs in the analyzed texts by R. Ivanychuk: the nouns to verbs ratio (1,45-1,76) is high enough to conclude that all his novellas share a specific idiostyle characterized by a robust, accurate and informative account of Ivanychuk’s thoughts on paper.
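As an illustration of how the part-of-speech coefficients follow from the word usage counts, the sketch below recomputes them for RI1. It is a minimal example under stated assumptions, not the author's software: the function name and dictionary keys are hypothetical, the counts are the RI1 word usage figures and sentence count reported earlier in this section, and the gerund and the present participle are merged with verbs, as described in the text.

```python
def pos_ratios(usage, sentences):
    """Part-of-speech coefficients from word usage counts (illustrative)."""
    # Gerunds and present participles are counted as verbs.
    verbs = usage["verb"] + usage.get("gerund", 0) + usage.get("present participle", 0)
    total = sum(usage.values())
    return {
        "adjectives_to_nouns": usage["adjective"] / usage["noun"],
        "adverbs_to_verbs": usage["adverb"] / verbs,
        "nouns_to_verbs": usage["noun"] / verbs,
        "aggressiveness": verbs / total,
        "logical_connectivity": (usage["conjunction"] + usage["preposition"]) / sentences,
        "embolism": (usage["exclamation"] + usage["particle"]) / total,
    }


# Word usage counts for RI1 ("And earth, and green, and song").
ri1_usage = {
    "noun": 2697, "verb": 1478, "adjective": 833, "adverb": 362,
    "pronoun": 976, "gerund": 56, "preposition": 956, "conjunction": 937,
    "particle": 435, "numeral": 30, "exclamation": 14, "present participle": 1,
}
ratios = pos_ratios(ri1_usage, sentences=398)

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

For RI1 the first four ratios and the embolism coefficient round to the values listed in Table 5. The coefficient of logical connectivity computed directly from the definition is 4,76, the value listed in Table 3, whereas Table 5 lists 1,59.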
In terms of linguistics, noun phrases and substantive groups significantly prevail in his writing. This shows that his writing has a “nominative” style, which also includes a wide and frequent use of adjectives that specify and describe everything named by nouns.
   The adjectives to nouns ratio (the number of adjectives per noun) in texts of the nominal idiostyle also characterizes the degree to which the writing is fiction-like, as adjectives are the main means of expressing tropes (namely epithets and comparisons). The adjectives to nouns ratio of the researched texts is fairly high (0,31-0,39), which means that Roman Ivanychuk used many epithets in his writing.
   The nominative style of his writing is also supported by the fairly low verbs to total number of words ratio (aggressiveness). It indicates that the writing style focuses on describing things rather than on reflecting actions. It also shows that the writing is emotionally neutral. The high coefficient of logical connectivity (within 1), that is, a harmonious connection between auxiliary parts of speech and syntactic constructions, demonstrates that the author's sentences tend to be complex and compound, which is also a distinctive feature of the nominative idiostyle in general.
   The length of words and sentences in the researched novellas of Roman Ivanychuk is presented in the table below:

Table 6. The statistical indicators of the distribution of word length in the novellas

        Max     Min     Mean    Mean square   Medium frequency
        value   value   value   deviation     fluctuation
RI1     22      1       5       2,8           0,0299
RI2     15      1       5,34    2,93          0,0338
RI3     17      1       5,08    2,93          0,0409
RI4     21      1       5,22    3,01          0,0454

Fig. 8. Average values of the statistical indicators of the distribution of word length in the novellas

The table below presents the statistical indicators of the length of words in the novellas by R. Ivanychuk compared with the same statistical indicators for other Ukrainian writers.
Table 7. The statistical indicators of the length of words by R. Ivanychuk compared with the same statistical indicators of other Ukrainian writers

Other Ukrainian writers          Mean value   Mean square deviation   Relative error
А. Головко (A. Holovko)          4,74         0,1                     0,03
О. Гончар (O. Honchar)           5,41         0,07                    0,02
О. Довженко (O. Dovzhenko)       4,73         0,08                    0,03
П. Панч (P. Panch)               5,28         0,29                    0,09
М. Стельмах (M. Stelmakh)        5,3          0,16                    0,05
Ю. Яновський (Iu. Ianovskui)     5,06         0,13                    0,04
Novellas by R. Ivanychuk         5,15         2,91                    0,01

The analysis of the given indicators shows that, in terms of the mean length of words, the novellas of R. Ivanychuk are close to the texts of Iu. Ianovskui and P. Panch. However, this may also reflect the specificity of this statistical indicator.
   The table below presents the statistical indicators of the distribution of the sentence length in the novellas of R. Ivanychuk.

Table 8. The statistical indicators of the distribution of the sentence length in the novellas of R. Ivanychuk

Statistical indicator            Value received
Quantity of different lengths    926
Mean value                       30,8
Mean square deviation            31,12
Medium frequency fluctuation     1,0105
Standard error                   1,0228
Relative error                   0,0651

Fig. 9. The distribution of lengths of sentences of R. Ivanychuk’s novellas in comparison with other genres


5       Conclusions

The research carried out allows us to conclude that the Ukrainian author Roman Ivanychuk possessed a special, perhaps unique, and definitely interesting and eye-catching manner of writing. Not only are his texts and plots gripping, but the form itself is also outstanding and out of the ordinary for that period of time. First of all, his manner of writing has a nominative style (a distinctive feature of his style), in which nouns and adjectives significantly prevail over the other parts of speech. This shows that his intention was to describe things, to reflect on paper how he saw the world around him. At the same time his writing was emotionally reserved.
Moreover, Roman Ivanychuk tended to use long sentences to describe his ideas and thoughts. The length of sentences in his writings is probably the greatest (or among the greatest) in Ukrainian prose.
   Additionally, it has to be mentioned that statistical research on Ukrainian fiction in general is still evolving. The methods of research used thus far are obsolete and need to be updated, and the samples used in the studies are generally small and need to be enlarged (which will provide wider and more accurate results). It is common to measure the length of words in characters and the length of sentences in words. However, the length of sentences, passages and even whole texts can also be measured in characters, and words can be widely used for measuring the length of passages, chapters, etc.
   In my research I decided to use the approach described above, although I did not include all of the results in the paper: without a comparison with other Ukrainian writers, these results are isolated and do not provide much value thus far. The intention is therefore to continue the research in this direction, to study other writers and to compare Ivanychuk’s manner of writing with theirs. The work definitely must go on, and it will.


6       References

1. Aleksienko, L., Zuban, O., Kozlemkozh, I.: Suchasna ukraiinska mova. Znannia, Kyiv, 534 p (2013).
2. Buk, S.: Kilkisne zistavlennia tekstiv (na materiali redaktsii 1884 ta 1907 rokiv povisti Ivana Franka “BOA CONSTRICTOR”). Ukrainske literatyroznavstvo, 76, pp. 179-192 (2012).
3. Ferdinand de Saussure: Kurs obshchei lingvistiky. Trudy po iazykozhanyiu, Moskwa, 269 p (1977).
4. Ivanychuk, R.: I zemlia, I zelo, I pisnia (eng. And earth, and green, and song). Sribne slovo. Lviv, pp. 6-35 (2006).
5. Ivanychuk, R.: Lisova povist (eng. Forest story). Sribne slovo. Lviv, pp. 116-139 (2006).
6. Ivanychuk, R.: Nespokutne (eng. No Atonement). Sribne slovo. Lviv, pp. 106-115 (2006).
7. Ivanychuk, R.: Solo na fleiti (eng. Flute Solo). Sribne slovo. Lviv, pp. 86-104 (2006).
8. Kamińska-Szmaj, I.: Części mowy w słowniku i tekście pięciu stylów funkcjonalnych polszczyzny pisanej (na materiale słownika frekwencyjnego). Biuletyn Polskiego Towarzystwa Językoznawczego, XLI, pp. 127–136 (1988).
9. Kulchytskyi, I.: Technolohichni apekty ukladannia korpusiv tekstiv. Monographia spilno z V., Shevchenko I., Zahnitko A. ta in. za redaktsiieiu Levchenko O. Vydavnytstvo Lvivska Politechnika, pp. 29-45 (2015).
10. Lawson, B., Sharp, R.: Introducing HTML5. Second Edition. New Riders, CA, pp. 295 (2012).
11. Levytshkyi, V.: Kvantytatyvnoe metody v lynhvystyke. Ruta, Chernivtsi, p. 190 (2004).
12. Ohorodnyk, V.: Kilkisnyi rozpodil rechen I slovoform u tvorakh Romana Ivanychuka. XIII Vseukrainska naukovo-metodychna konferentsiia molodykh naukovtsiv, Mykolaiv, 84 p (2018).
13. Ruszkowski, M.: Wskaźnik epitetyzacji w badaniach stylistycznych. Respectus Philologicus, № 5(10), pp. 48–53 (2004).
14. Starko, V.: Ukrainska: dykh is bukva v tsyphri, https://zbruc.eu/node/87161, last accessed 2019/12/26.
15. Ukrainskyi pravopus, https://mon.gov.ua/ua/osvita/zagalna-serednya-osvita/navchalni-programi/ukrayinskij-pravopis-2019, last accessed 2019/12/26.