=Paper=
{{Paper
|id=Vol-2667/paper9
|storemode=property
|title=Improvement of the algorithm of automated definition of rhyme
|pdfUrl=https://ceur-ws.org/Vol-2667/paper9.pdf
|volume=Vol-2667
|authors=Vladimir Barakhnin,Olga Kozhemyakina,Ilya Pastushkov,Irina Kuznetsova,Yulia Borzilova
}}
==Improvement of the algorithm of automated definition of rhyme ==
Authors (all at the Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia):
* Vladimir Barakhnin, bar@ict.nsc.ru
* Olga Kozhemyakina, ORCID: 0000-0003-3619-1120
* Ilya Pastushkov, ORCID: 0000-0002-0341-7931
* Irina Kuznetsova, ORCID: 0000-0002-6890-1636
* Yulia Borzilova, ORCID: 0000-0002-8265-9356

Abstract: The paper considers approaches to improving one of the steps of the algorithm used for the automated determination of rhyme in poetic texts. The automated rhyme detection tool is one of the modules of a system for the complex analysis of poetic texts. In the current implementation of the module, the rhyme search and definition subtask is solved by finding words with consonant endings, using the A. A. Zaliznyak Grammar Dictionary of the Russian Language and the basic rules of phonetic analysis. Alternative solutions to the problem of searching the dictionary for words with consonant endings are proposed. The results obtained allow us to conclude that the current implementation is promising and that the methods used can be refined to determine the rhyme of a poetic text.

Keywords: analysis of poetic texts, metrorhythmic analysis, rhyme identification

===I. INTRODUCTION===

One of the tasks of the automated complex analysis of poetic texts [1] is to determine the characteristics related to the metrorhythmics of a poem. Among the works in which statistical information extracted from poetic text was used to solve philological problems, we can mention the study [2]. Although it was compiled manually, it presents a rather comprehensive statistical picture of the metrorhythmics of Pushkin's works, which allowed its authors to find the patterns inherent in Pushkin's rhyme. Modern information technologies make it possible to conduct such studies, if not completely automatically, then with minimal involvement of expert philologists.

Automated analysis of poetic texts has to be adapted to each language. The differences between approaches are caused both by the specifics of the language (in particular, the way poetic texts are constructed) and by the tools used by researchers. The toolkit, in turn, depends on the goals set by the researchers (for example, confirming some regularity in the structure of the poetic text) and, to some extent, on the technologies that were available at the time of the study.

As for the linguistic versatility of the instruments, it is not feasible to develop a single system for the automated analysis of meter and rhythm that covers a wide range of languages. Moreover, even developing a metrorhythmic analysis system suitable for a group of related languages appears to be an intractable task: each language requires its own approaches that take its structure into account. The study [3] reports an experiment on languages similar in structure, which ended unsuccessfully because of the specifics of each of the languages considered.

The problem of analyzing the metrorhythm of poetic texts is therefore solved differently for each language (or group of similar languages). Below we consider several projects that address this problem for different languages.

D. Fusi [4, 5] introduced the Chiron system, which supports analysis in several languages (Latin and Greek). The system is built at a level of abstraction that makes it possible to work with several different languages, meters and texts. It has a modular structure: each module passes data to the next one in a predetermined format. The higher a component is in the hierarchical chain, the more abstract the analysis it performs. The hierarchy levels in the system are:
* phonetics and prosody;
* appositives and clitics;
* metrical scansion.

The author of [4, 5] does not report accuracy statistics for each level (phonetics, clitics, metrics), but it can be assumed that the accuracy is below the maximum. The author emphasizes that the system (like similar ones) is not meant to replace the expert completely; its main task is to provide researchers with data whose processing would otherwise consume a significant share of human resources.

B. Navarro proposed a tool that studies the metrics of Spanish sonnets and performs semantic analysis of poems [6, 7]. At present, the system is applied to a corpus of 5078 sonnets of the 16th and 17th centuries. The corpus is converted to the TEI format (Text Encoding Initiative, https://tei-c.org/); the input to the system is the sequence of characters of one poem without additional markup. A rule-based module separates syllables with the help of an external grammatical tagging system. If the syllabic partition produces 10 metrical syllables, the system considers the scan complete. For non-standard situations, a number of rules are applied (the authors of the project do not describe these rules in detail).

Copyright © 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

The system [3], mentioned previously, is a tool for the complete analysis of poetic texts in Portuguese. Its input is a set of poems in XML format, and it scans each poem independently of the other poems in the corpus. The analysis includes the following steps:
* text preprocessing (conversion to XML format);
* extraction of words from a poem;
* finding the stressed syllable;
* division into syllables;
* forming a phonetic transcription (using an independent dictionary);
* selection of transcription options for each poem (determination of the rhythmic scheme);
* an attempt to determine the meter of the poem;
* search for matching meters based on the most appropriate rhythmic scheme;
* splitting the work into syllables according to the meter.

The analysis in [3] showed high accuracy (95–98%); however, for other languages similar in structure, the experiment on the analysis of poetic texts ended unsuccessfully because of the specifics of the individual languages.

M. Agirrezabal et al. [8] developed the ZeuScansion system, which performs syntactic analysis of English poems. The system uses dictionaries to determine the stressed syllable in a word. The words are combined to form the stress pattern of the whole poem; the system also performs syntactic analysis, followed by the application of a series of rules. If a word is not found in the dictionary, the program uses heuristics to find and use the closest matching word.

R. Ibrahim and P. Plecháč [9, 10] developed the KVĚTA system for the analysis of Czech poems. The system takes as input a poem whose words must carry morphological markup. KVĚTA applies a series of rules that transform the poetic text into a phonetic transcription; if the rules cannot be applied, a dictionary is used. The system compares the patterns found in the poems with the generated variations. Initially, the idea of a metric index was used [11]. Later, the authors used a metric coefficient based on some other parameters, which increased the accuracy from 94.88% to 95.94%.

A number of works are devoted to the analysis of versification in Arabic and similar languages. A. Kurt and M. Kara [12] proposed an algorithm for recognizing and analyzing poems written in "arud", a versification system typical of eastern (Arabic, Persian, Turkish) poetry. M. A. Alnagdawi et al. described a method for finding poetic meters using context-free grammars [13]. A. Almuhareb et al. [14] described methods for detecting poetic patterns in Arabic in order to extract verses.

For the Russian language, a number of solutions to the problem of metrorhythmic analysis have also been proposed. The study [15] considers automatic procedures for specifying a poetic text: metrorhythmic marking and identification of the verse meter. The metrorhythmic marking is automated using the following procedures [16]:
* line numbering of the poem;
* tokenization of words;
* accentuation of the poem;
* selection of rhymed lines;
* syllabic determination.

The authors developed an open network resource (Wikipoetics: http://wikipoetics.ru/), which consists of two components: the problem-oriented "Poetology Thesaurus" and the "Block of Analysis and Specification" of text objects. In the analysis and specification block, two sets of tasks are identified [15]: the specification of terminological articles of the thesaurus and the specification of a poem. The complex solves the following groups of problems:
* metrorhythmic marking of the text;
* filling in the fields of the specification of the poem;
* meter identification.

Among the tools that perform metrorhythmic analysis, two web resources are of interest: Rifmoved.ru (http://rifmoved.ru/analiz_stihov.htm) and Neogranka.ru (http://neogranka.ru/razmer_stiha.html). The first, Rifmoved.ru, is positioned as a supporting tool for the analysis of poetic text that determines the stanza and the form (sonnet, sestina). Its algorithms were developed on the basis of V. Onufriev's own concept of programmatic poetry analysis; however, no theoretical description of these algorithms was found in scientific sources. The authors of the resource indicate that the algorithms are designed to analyze poetic texts written in traditional forms, classical stanzas and meters. This greatly limits the use of the tool for large corpora of texts by poets who do not belong to classical literature.

The second resource, Neogranka.ru, is apparently an amateur web portal for determining the poetic meter, generating new poems and selecting rhymes. When a user tries to determine the verse meter, the service asks the user to resolve every ambiguous case (accentuation options), which takes a lot of the user's time. No theoretical description of the algorithms used is available for this resource either.

It is important to note that almost all the algorithms mentioned above are aimed at relatively small text corpora covering the work of one or a few authors, so the speed of the rhyme determination algorithms is not a critical parameter. However, the research conducted at the Federal Research Center for Information and Computational Technologies (FRC ICT) plans to study the interdependence of the phonometric and lexical-thematic levels of poetic texts, with the aim of identifying and measuring the relationships between semantic associations, described on the basis of semantic fields, and poetic meters: the so-called textures that take into account construction and metrorhythmics (a detailed statement of the problem is given in [17]). One of the main difficulties in solving this problem is the need to analyze a large corpus of poetic texts, which makes it necessary to optimize the rhyme search algorithms with respect to running time. These algorithms usually issue multiple queries to databases containing phonetic transcriptions of words, so we are faced with the task of optimizing such SQL queries. Note that this task is becoming relevant in all areas of scientific research that work with Big Data, from business analytics to the analysis of Earth remote sensing data (for example, [18, 19]).

===II. THE PROBLEM STATEMENT===

A web application for analyzing the structural level of Russian-language poetic texts has been developed at FRC ICT (Analysis of poetic texts online, http://poem.ict.nsc.ru/). Its algorithms are described in [20]. They do not handle complex cases of the analysis of poetic texts; therefore, [21] proposed an implementation of the algorithms from [15], which includes a more rigorous classification of poems by meter. However, the rhyme determination algorithm in [15] relies on the web-based application "Big Rhyme Dictionary" (http://rifmovnik.ru/docs.htm): the application receives a word and returns the full set of words that rhyme with it (out of context). This approach takes a lot of time and resources, so the rhyme search algorithm in [21] is based instead on the conditions under which a rhyme can arise: two lines rhyme if their last words have the stressed syllable in the same position and their endings match phonetically.

To identify phonetically matching endings, the data on endings from [22] are used. The algorithm looks up each word in a table of word forms aggregated from A. A. Zaliznyak's dictionary [23]; this lookup is implemented in a standard way.

The purpose of the study is to find alternative approaches to searching for rhyming lines in the database and to conduct a series of experiments that identify the most effective method for use in the algorithm. The proposed solution is as follows:
# Build a table with inverted rows sorted lexicographically.
# Split all words into sections according to their endings; in other words, store the (inverted) endings separately from the word, as metadata.
# Add trigram character indexes to the original table with all the words.
# Perform an experiment on finding rhyming words using sections (searching only the endings) and trigram indexes.
# Compare the performance of the section-based search, the index-based search, and a combination of the two.

To test the hypothesis that trigram character indexes are effective, it was decided to conduct an additional experiment measuring the performance of SQL queries that use the indexes in the search module of the system for the complex analysis of poetic texts. This module searches by low-level characteristics (for example, metrorhythmic statistics) and high-level characteristics (for example, genre and style affiliation). When a search request is made, SQL queries to the database are generated, some of which search by values of the varchar and text types. The execution time of such queries can potentially be reduced by using character indexes.

===III. THE RESEARCH PROCESS===

====A. Data preparation====

One option for implementing the search is to partition the source table of words into sections. The version of PostgreSQL deployed on the FRC ICT server supports simple partitioning: splitting one large logical table into several smaller physical sections (PostgreSQL: Documentation 9.4: Partitioning, https://www.postgresql.org/docs/9.4/ddl-partitioning.html). The benefits of using sections are:
* When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of a sequential scan of that partition instead of using an index and random access reads scattered across the whole table.
* Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. ALTER TABLE NO INHERIT and DROP TABLE are both far faster than a bulk operation. These commands also entirely avoid the VACUUM overhead caused by a bulk DELETE.
* Seldom-used data can be migrated to cheaper and slower storage media.

PostgreSQL supports range partitioning (for example, one might partition by date ranges) and list partitioning, in which the table is partitioned by explicitly listing which key values appear in each partition. This study uses list partitioning, with the endings of the dictionary words as key values.

To create the list of sections in the form of tables, a Python script is used that operates as follows:
# Query the table with words.
# Select the last N characters of each word.
# Form an array of all dictionary endings.
# Count the occurrences of each ending and sort the endings in descending order of frequency.
# Separate the first M endings of the array, on the basis of which the sectioning is performed.

Four characters are taken as the last N characters; this value can be changed in the future. To build the sorted dictionary, we use the Counter class of the collections module (Collections: Container datatypes, https://docs.python.org/3.7/library/collections.html). The result is a dictionary of the following structure:

 Counter({'НОГО': 86077, 'ЕЙСЯ': 76978, 'ВШЕЙ': 76400, 'ИМСЯ': 62934,
          'ШЕГО': 61719, 'ГОСЯ': 57630, 'ИХСЯ': 57617, 'НОМУ': 57354,
          'ВШИМ': 57282 ...})

In this dictionary, the key is the ending (the last N characters), and the value is the number of occurrences of that ending. It was decided to select from the dictionary the endings whose counts fall within the 90th percentile. These endings were used to create the sections.

Creating the partitioned tables includes the following steps:
# Create a parent table, whose properties are inherited by all the child tables (sections). The parent table is a table with words from the dictionary of A. A. Zaliznyak [23] with the following structure:
#: a) identifier;
#: b) word;
#: c) accentuation (the number of the syllable on which the stress falls);
#: d) word type.
#: As an additional column, the ending (last N characters) of each word is added in inverted order.
# Create the child tables, which inherit the structure of the parent. These tables have no columns other than the inherited ones. All child tables are called sections.
# Add constraints to the section tables that define the valid key values for each section. The constraints do not overlap: no key value applies to several sections at once.
# Create an index on the key column of each section. In this study, the indexes were created on the "word" column.
# Define a trigger that redirects data added to the main table to the corresponding section. The trigger fires when the SQL command INSERT is run.

The trigger launches a function that adds the values to the corresponding section (table). A fragment of the function:

 CREATE OR REPLACE FUNCTION words_with_reversed_endings_insert_function()
 RETURNS TRIGGER AS $$
 BEGIN
     IF (NEW.ending = 'нии') THEN
         INSERT INTO words_with_endings_nii
             (id, word_form, ending, accent, word_type)
         VALUES (NEW.id, NEW.word_form,
                 reverse(NEW.ending), NEW.accent, NEW.word_type);
     ELSIF (NEW.ending = 'ний') THEN
         INSERT INTO words_with_endings_niy
             (id, word_form, ending, accent, word_type)
         VALUES (NEW.id, NEW.word_form,
                 reverse(NEW.ending), NEW.accent, NEW.word_type);
     -- ... (the remaining branches are not shown in the source fragment)

The tables, indexes, trigger and function are created automatically by a Python script. Table and index names still have to be adjusted manually in some cases, because the names are built from transliterated endings (the transliterate package is used: https://pypi.org/project/transliterate/). Such cases include, for example, the endings "ЕМСЯ" and "ЁМСЯ", whose transliterated names coincide.

In the context of a PostgreSQL database, a trigram is a group of three consecutive characters. The similarity of two strings can be measured by counting the number of trigrams they share. This simple idea turns out to be very effective for measuring the similarity of words in many natural languages, as well as for solving related problems such as fuzzy search (search by similarity). PostgreSQL supports two types of text indexes (Postgres Pro Standard, https://postgrespro.ru/docs/postgrespro/9.5/textsearch-indexes): GIN (Generalized Inverted Index) and GIST (Generalized Search Tree). Both can work with character trigrams, which is a prerequisite for using GIN here, since by default GIN operates on lexemes. Although GIN, by its description, is very similar to the experiment with inverted strings, GIST also has grounds for application: its tree-like structure increases the completeness of search results by including inexact hits. This suits the rhyme search task, since the table from the work of V. M. Zhirmunsky [22] contains, among others, pairs of endings that do not coincide in spelling.

Within the search module of the system for the complex analysis of poetic texts, text indexes can be added to solve the following problems:
* search by accentuation mask;
* search by words from the title and text.

The corpus of Pushkin's works was loaded into the database; the main tables with the texts of the works and their metadata contain more than 700 rows. The search query not only solves the above problems directly but is also adapted for the user: the response includes additional fields, such as the author's full name and the title of the poem, so the query is composite. During the experiment, the time spent processing these additional parts of the query is not taken into account.

====B. Experiments====

The following experiments on searching for rhymes in the PostgreSQL database are planned:
# Search for the desired ending among the section names: <code>SELECT * from pg_catalog.pg_tables where %section name conditions%</code>. Note that this is the only experiment that uses the inverted strings described above.
# Search for the endings by partial match with LIKE on a table without indexes.
# Search for the endings by partial match with LIKE on a table with a GIN index on character trigrams: <code>CREATE INDEX trgm_idx ON test_trgm USING GIN (t gin_trgm_ops);</code>
# Search for the endings by partial match with LIKE on a table with a GIST index on character trigrams: <code>CREATE INDEX trgm_idx ON test_trgm USING GIST (t gist_trgm_ops);</code>

For the experiment, a minimal sample of 100 endings was taken: 80% of the sample was randomly selected from the most frequently used endings (the first 500), and the remaining 20% was randomly selected from the next 100 endings. The running time was measured for each of the options (1)–(4). In each experiment, the following characteristics were obtained (abbreviations in brackets):
* average SQL query runtime (avg);
* 50th percentile (median);
* 90th percentile (90 perc);
* 95th percentile (95 perc);
* 98th percentile (98 perc).

The results are shown in Table I.

{| class="wikitable"
|+ TABLE I. THE RESULTS OF THE EXPERIMENT
! Search option !! Avg !! Median !! 90 perc !! 95 perc !! 98 perc
|-
| Search among section names || 0.798 || 0.831 || 1.704 || 1.942 || 2.381
|-
| Without indexes || 2.497 || 2.364 || 3.450 || 4.109 || 4.193
|-
| GIN || 2.258 || 2.130 || 2.974 || 3.343 || 3.645
|-
| GIST || 2.407 || 2.392 || 3.179 || 3.330 || 3.607
|}

A number of conclusions can be drawn:
* The least time-consuming option turned out to be (1), the search among section names. This result is partly explained by the endings for which no sections were created: in such cases, the cost of executing the SQL query was negligible.
* The results of the search without indexes and of the search using the GIST index differ only slightly, which indicates that the GIST index is not appropriate for this problem.
* The GIN index showed satisfactory results for partial-match search (3).

The additional experiment, which measures the time spent searching by accentuation mask or by words from the poems, consists of forming search queries and comparing their effectiveness. The GIST and GIN trigram character indexes are added separately to the text fields of the tables involved in the query, namely the fields "Mask of accentuation" and "Text of the poem"; at the first stage of the experiment, the query runtime without indexes was measured. The types of queries executed were:
* without indexes;
* using a GIST index (with the operator class gist_trgm_ops);
* using a GIN index (with the operator class gin_trgm_ops).

A fragment of a typical SQL query whose runtime was measured:

 SELECT
     a."ID" AS AUTHOR_ID,
     a."LASTNAME", a."FIRSTNAME", a."MIDDLENAME",
     p."ID" AS POEM_ID, p."NAME" AS POEM_NAME,
     m."ACCENTUATION_MASK"
 FROM "AUTHOR" a
 LEFT JOIN "POEM" p ON p."AUTHOR_ID" = a."ID"
 LEFT JOIN "MRSTATISTICS" m ON m."POEM_ID" = p."ID"
 WHERE m."ACCENTUATION_MASK" LIKE 'cC cC cC c'

The query runtime was measured with different variants of the search conditions (for example, searching for a different number of words); each result is the average over 10 queries with the GIST and GIN indexes. The results of the experiment are shown in Table II.

{| class="wikitable"
|+ TABLE II. THE RESULTS OF AN EXPERIMENT TO ADD INDEXES TO A SEARCH MODULE
! rowspan="2" | Avg query runtime, msec
! colspan="3" | Index name
|-
! Without indexes !! GIST !! GIN
|-
| Search by accentuation mask || 99 || 117 || 116
|-
| Search by words from the name and text || 129 || 146 || 144
|}

The additional experiment showed an increase in the processing time of queries on the text values used in the SQL query. Such an increase may be caused either by insufficient test data or by the inefficiency of the applied indexes for the problem being solved.

===IV. CONCLUSIONS===

The use of PostgreSQL's built-in tools has long been limited to search engines in the modern sense, where results are returned in response to a natural-language request via a DBMS (Database Management System). For the task of rhyme search, program performance is not a determining factor. The present work has shown the most promising approaches, together with examples of how to speed up the algorithm significantly using simple steps. This allows other researchers to apply these approaches in their own work without expert knowledge of PostgreSQL. In addition, the interface for accessing the DBMS does not change (except for the manual construction of a table with inverted rows), which is convenient for developers who integrate text analysis systems with a PostgreSQL database.

===ACKNOWLEDGMENT===

The study was carried out with the support of the Russian Science Foundation (project No. 19-18-00466).

===REFERENCES===

* [1] V. Barakhnin and O. Kozhemyakina, "About the automation of the complex analysis of russian poetic text," CEUR Workshop Proceedings, vol. 934, pp. 167-171, 2012.
* [2] N.V. Lapshina, I.K. Romanovich and V.I. Yarkho, "Metrical Handbook for Pushkin's poems," M., L.: Academia, 1934.
* [3] A. Mittmann, "Escansão automático de versos em português," Universidade Federal de Santa Catarina, 2016.
* [4] D. Fusi, "An Expert System for the Classical Languages: Metrical Analysis Components," Lexis, vol. 27, pp. 25-45, 2008.
* [5] D. Fusi, "A Multilanguage, Modular Framework for Metrical Analysis: IT Patterns and Theorical Issues," Langages, vol. 3, no. 199, pp. 41-66, 2015.
* [6] B. Navarro, "A Computational Linguistic Approach to Spanish Golden Age Sonnets: Metrical and Semantic Aspects," Proceedings of the Fourth Workshop on Computational Linguistics for Literature, pp. 105-113, 2015.
* [7] B. Navarro, M.R. Lafoz and N. Sánchez, "Metrical Annotation of a Large Corpus of Spanish Sonnets: Representation, Scansion and Evaluation," Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC), pp. 4360-4364, 2016.
* [8] M. Agirrezabal, "ZeuScansion: a Tool for Scansion of English Poetry," Journal of Language Modelling, vol. 4, no. 1, pp. 3-28, 2016.
* [9] R. Ibrahim and P. Plecháč, "Towards the Automatic Analysis of Czech Verse," Formal Methods in Poetics, Lüdenscheid: RAM-Verlag, pp. 295-305, 2011.
* [10] P. Plecháč, "Czech Verse Processing System KVĚTA – Phonetic and Metrical Components," Glottotheory, vol. 7, no. 2, 2016.
* [11] I. Pilshchikov and A. Starostin, "The problems of automation of basic procedures rhythmic parsing accentual-syllabic texts," Russian National Corpus: 2006-2008: New results and perspective, pp. 298-315, 2009.
* [12] A. Kurt and M. Kara, "An algorithm for the detection and analysis of arud meter in Diwan poetry," Turkish journal of electrical engineering & computer sciences, vol. 20, no. 6, pp. 948-963, 2012.
* [13] M.A. Alnagdawi, H. Rashideh and A.F. Aburumman, "Finding Arabic Poem Meter using Context Free Grammar," Journal of Communications and Computer Engineering, vol. 3, no. 1, pp. 52-59, 2013.
* [14] A. Almuhareb, "Recognition of Classical Arabic Poems," Proceedings of the Workshop on Computational Linguistics for Literature, pp. 9-16, 2013.
* [15] V.N. Boikov, M.S. Karyaeva, V.A. Sokolov and I.A. Pilshchikov, "On an Automatic Procedure for the Specification of a Poetic Text for an Open Information-Analytical System," CEUR Workshop Proceedings, vol. 1536, pp. 144-151, 2015.
* [16] I. Pilshchikov and A. Starostin, "Reconnaissance automatique des mètres des vers russes: Une approche statistique sur corpus," Langages, vol. 3, no. 199, pp. 89-106, 2015.
* [17] K. Taranovsky, "About the relationship between poetic rhythm and topic," About poetry and poetics, Moscow: Languages of Russian culture, pp. 372-403, 2000.
* [18] N.I. Golov and L. Ronnback, "SQL query optimization for highly normalized Big Data," Business Informatics, vol. 33, no. 3, pp. 7-14, 2015.
* [19] L.I. Lebedev, Yu.V. Yasakov, T.Sh. Utesheva, V.P. Gromov, A.V. Borusjak and V.E. Turlapov, "Complex analysis and monitoring of the environment based on Earth sensing data," Computer Optics, vol. 43, no. 2, pp. 282-295, 2019. DOI: 10.18287/2412-6179-2019-43-2-282-295.
* [20] V.B. Barakhnin, O.Y. Kozhemyakina and A.V. Zabaykin, "The Algorithms of Complex Analysis of Russian Poetic Texts for the Purpose of Automation of the Process of Creation of Metric Reference Books and Concordances," CEUR Workshop Proceedings, vol. 1536, pp. 138-143, 2015.
* [21] V.B. Barakhnin, O.Yu. Kozhemyakina and I.V. Kuznetsova, "Development and Implementation of the Algorithm for Automatic Analysis of Metrorhythmic Characteristics of Russian Poetic Texts," CEUR Workshop Proceedings, vol. 2523, pp. 290-298, 2019.
* [22] V.M. Zhirmunsky, "Rhyme, its history and theory," Petrograd: Academia, 1923.
* [23] A.A. Zaliznyak, "Grammatical dictionary of the Russian language. The changing word forms: about 10,000 words," M.: Russian language, 1980.

VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020), Data Science section.
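The data preparation part of the research process describes a Python script that takes the last N characters of every word, counts the endings with a Counter, and keeps the most frequent ones as section keys. A minimal sketch of that idea follows. The function names, the uppercase normalisation and the nearest-rank reading of "included in the 90th percentile" are assumptions made for illustration; the paper does not show the script itself.

```python
from collections import Counter


def count_endings(words, n=4):
    """Count how often each n-character ending occurs.

    Words shorter than n characters are skipped (an assumption;
    the paper does not say how short words are handled).
    """
    return Counter(w[-n:].upper() for w in words if len(w) >= n)


def select_section_endings(counts, percentile=90):
    """Return endings whose count is at or above the given percentile.

    Uses the nearest-rank definition of a percentile with integer
    arithmetic; this is one possible reading of the paper's criterion.
    """
    if not counts:
        return []
    freqs = sorted(counts.values())
    rank = max(1, (percentile * len(freqs) + 99) // 100)  # ceiling division
    threshold = freqs[rank - 1]
    # most_common() yields endings in descending order of frequency
    return [ending for ending, c in counts.most_common() if c >= threshold]


if __name__ == "__main__":
    sample = ["красного", "иного", "большого"]
    print(count_endings(sample))
```

The selected endings would then serve as the key values for list partitioning, one section per ending.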
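The paper notes that a trigram is a group of three consecutive characters and that the similarity of two strings can be measured by counting shared trigrams. The sketch below illustrates this in Python. It treats the whole string as one word, pads it with two leading spaces and one trailing space, and divides the number of shared trigrams by the number of distinct trigrams in both strings. This is a simplified illustration of the idea, not the database's own implementation, and the function names are hypothetical.

```python
def trigrams(text):
    """Return the set of character trigrams of a lowercased, padded string."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Share of trigrams common to both strings (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # Two words that share only their final letters
    print(similarity("любовь", "морковь"))
```

Because the trailing padding makes word-final trigrams distinct from word-internal ones, words with a common ending share trigrams even when the rest of the word differs, which is what makes the measure relevant to rhyme search.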
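The first step of the proposed solution is a table of inverted strings sorted lexicographically. The reason this helps is that a search by ending becomes a search by prefix over the reversed strings, and a prefix search over sorted data can use binary search. The sketch below shows the principle in memory with Python's bisect module; the function names and the sample words are hypothetical, and the paper's actual implementation is a database table, not a Python list.

```python
import bisect


def build_reversed_index(words):
    """Return (reversed word, word) pairs sorted lexicographically."""
    return sorted((w[::-1], w) for w in words)


def words_with_ending(index, ending):
    """Find all words with the given ending via a prefix scan."""
    key = ending[::-1]
    # (key,) sorts before any (key..., word) pair, so this locates the
    # first reversed string that is >= key
    start = bisect.bisect_left(index, (key,))
    result = []
    for reversed_word, word in index[start:]:
        if not reversed_word.startswith(key):
            break  # matches are contiguous in sorted order
        result.append(word)
    return result


if __name__ == "__main__":
    idx = build_reversed_index(["кровь", "любовь", "морковь", "дом"])
    print(words_with_ending(idx, "овь"))
```

An ordinary B-tree index on a reversed-string column gives a database the same property: a LIKE pattern anchored at the start of the reversed string can be answered by a range scan.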