=Paper=
{{Paper
|id=Vol-2852/paper4
|storemode=property
|title=A Method for Assessment of Text Complexity Based on Knowledge Graphs
|pdfUrl=https://ceur-ws.org/Vol-2852/paper4.pdf
|volume=Vol-2852
|authors=Vladimir Ivanov,Marina Solnyshkina
}}
==A Method for Assessment of Text Complexity Based on Knowledge Graphs==
Vladimir Ivanov (a), Marina Solnyshkina (b)
(a) Innopolis University, Innopolis, Russian Federation
(b) Kazan Federal University, Kazan, Russian Federation

Abstract

The study explores the problem of assessing text complexity. In this paper we focus on measuring conceptual complexity and propose using knowledge graphs to this end. In the first stage of the research, the RuThes-Lite thesaurus, a linguistic knowledge base with over 100,000 text entries (words and collocations), was used to elicit concepts in the texts of schoolbooks and to represent text fragments as graphs. In the second series of experiments, we assessed the complexity of English texts using the WordNet and Wikidata knowledge graphs. Finally, we identified graph-based semantic characteristics of texts that affect complexity. The most significant findings include statistically significant correlations of the selected features, such as node degree, number of connected nodes, and average shortest path, with text complexity.

Keywords

text complexity, thesaurus, knowledge graphs

1. Introduction

Of the three generally accepted levels of text complexity, i.e., lexical, syntactic and semantic/informational/conceptual, the third is evidently the most intricate to scrutinize and is universally recognized as the least explored [1]. It is the semantic level of a text, defined as the amount of background knowledge required to comprehend a text, that to a great extent facilitates text comprehension. Automatic measurement of the lexical and syntactic complexity of Russian texts has been proposed in a number of studies [2, 3, 4], which is not unexpected, as these two levels are easier to formalize. The predominant approaches in previous work on Russian text complexity combine lexical and syntactic features [5].
As for the influence of the semantic level on text complexity, the studies conducted are still few and mostly for English. In this article, we develop a new approach to defining the conceptual complexity of texts through knowledge bases such as WordNet. The conceptual complexity of a text is viewed as the amount of knowledge (in particular, from the thesaurus) necessary for understanding it. Comprehension of conceptually complex texts requires substantial background knowledge as well as knowledge of abstract notions.

Evaluation of the proposed approach implies using a set of texts of different conceptual complexity. One possible way to solve the problem is to abridge original texts and use the abridged versions as part of the corpus. This idea has been implemented in [6]. However, it seems important to assess the method on authentic, not artificial, texts. A natural example of a set of texts of different conceptual (not just formal, lexical and syntactic) complexity are school textbooks of different grades, which we use as the material in the current study. An adequate conceptual level of text complexity is especially important for readers with an insufficient level of knowledge [7, 8], in particular, for schoolchildren: when schoolchildren lack the necessary knowledge, they may experience difficulties in text comprehension.

Proceedings of the Linguistic Forum 2020: Language and Artificial Intelligence, November 12-14, 2020, Moscow, Russia. nomemm@gmail.com (V. Ivanov); mesoln@yandex.ru (M. Solnyshkina). ORCID: 0000-0003-3289-8188 (V. Ivanov); 0000-0003-1885-3039 (M. Solnyshkina). © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073).
Thus, while developing educational materials for a specific audience, a learning-material designer is expected to be aware of the approximate amount of theoretical and practical knowledge which the target reading audience can employ. For this purpose, we use a corpus of school textbooks as a reference corpus.

2. Related works

2.1. Analysis of conceptual text complexity

A deeper level of semantic analysis, also referred to as conceptual analysis [6], implies taking into account semantic and pragmatic links between concepts in a text. Although it is the conceptual level that presents the real complexity of a text, so far the feasibility and methods of measuring the conceptual level of text complexity have remained unexplored. The notion of the conceptual complexity of a text is viewed as related to the number of abstract concepts verbalized in a text, i.e., the incidence of abstract words [1]. The correlation between text abstractness and "linguistic complexity" was convincingly demonstrated in [9], where the author used Russian texts as the material for the study. In a similar study, A. Laposhina [10] found that only one of the four groups of text lexis which she identified with the help of ABBYY Compreno, namely words denoting abstract concepts, could be used as an indicator of text complexity. The groups A. Laposhina classified included the following: (1) 'lex_physical', i.e., nouns denoting specific material objects, including people (e.g., 'cutlet', 'table', 'mom'); (2) 'lex_virtual', i.e., virtual, intangible objects (e.g., 'base', 'internet'); (3) 'lex_abstract', words denoting abstract concepts, including terms (e.g., 'avantgarde', 'whim', 'effacement'); and (4) 'lex_substance', names of substances (e.g., 'silver', 'vinegar'). Elements of a similar semantic approach are implemented in [11], in which the authors apply latent semantic analysis to determine the semantic proximity of text fragments.
However, in most published studies, researchers analyze not a corpus or a text as a whole, but adjacent sentences or paragraphs only. An important component of text complexity is also text cohesion, which has been investigated in a number of studies [12, 11]. In [11], cohesion is defined as a concept uniting referential cohesion and deep cohesion. The first indicates how concepts in a sentence or in adjacent sentences overlap and is manifested in repeated words, stems, arguments, etc. The second establishes cohesion through the sequence of tenses, the use of subordinate and connecting conjunctions, and other means. In [13], cohesion is viewed as a notion verbalized with lexical chains. The latter refer to a number of adjacent words with similar meanings. Thornbury emphasizes the importance of lexical chains for maintaining text cohesion, arguing that "The lexical connectors include repetition and the lexical chaining of words that share similar meaning" [13]. The author also provides an example of a set of isolated sentences that have matching grammatical categories but do not form a coherent text: "The university has got a park. It has got a modern tram system. He has got a swimming pool."

As for the semantic similarity of words, there are two different approaches to quantifying the extent to which words have similar meanings. The first approach (latent semantic analysis) uses statistical characteristics of words in a text: frequency and co-occurrence. It has been successfully used in a number of papers on text complexity, although, as noted above, mainly for local analysis. Another possible approach is to explore semantic relations between words in a text. To this end, a researcher can use the information presented in semantic networks. The theory of semantic networks began developing half a century ago [14] with the purpose of explaining the structure of human memory.
By now, numerous network models have been developed, and there is an extensive bibliography on the topic. The research performed in [15] indicates that many of the semantic networks studied, including those built on the basis of psycho-semantic associative experiments, have similar statistical characteristics.

2.2. Knowledge Graphs

In modern studies of word processing, the two most popular and most frequently used resources are thesauri (or lexical ontologies), such as WordNet [16], and Wikidata (https://wikidata.org/), a knowledge graph with multiple semantic relationships between concepts. WordNet was originally created to study human memory, which gives grounds to expect that the use of structures like WordNet will make it possible to advance our understanding of the problem of text complexity. The semantic proximity of words is determined by their closeness in the structure of the knowledge base or thesaurus. Thesauri are themselves a type of knowledge base, since, unlike traditional dictionaries, they contain not only linguistic but also extralinguistic information, i.e., world knowledge. The latter is registered in thesauri with hypo-hyperonymic relations: for example, a tomato is connected to the class of vegetables, not garments. RuWordNet, a Russian thesaurus, presents the concept of "vegetables" as a member of a synonymic set, a set of hyponyms and a set of hyperonyms (https://ruwordnet.ru/ru/).

The WordNet thesaurus is widely used to measure the semantic proximity of words. For example, by September 25, 2019, Google Scholar registered 58,900 articles mentioning WordNet as a tool and word similarity as a research objective; over 3,150 works on the topic appeared in Google Scholar in September 2019 alone. In [17], the authors offer the first systematic comparison of various measures of word proximity based on WordNet.
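To make such path-based proximity measures concrete, here is a minimal sketch on an invented toy hypernym graph; WordNet's path-based similarity is computed analogously over its hypernym hierarchy, and all node names below are illustrative stand-ins, not real thesaurus entries.

```python
from collections import deque

# Toy hypernym graph: child -> parents (invented for illustration).
HYPERNYMS = {
    "dog": ["canine"], "cat": ["feline"],
    "canine": ["carnivore"], "feline": ["carnivore"],
    "carnivore": ["animal"], "animal": [],
}

def to_undirected(hypernyms):
    """Build an undirected adjacency map from child -> parent edges."""
    adj = {}
    for child, parents in hypernyms.items():
        adj.setdefault(child, set())
        for p in parents:
            adj[child].add(p)
            adj.setdefault(p, set()).add(child)
    return adj

def path_distance(adj, a, b):
    """BFS shortest-path length (in edges) between two concepts."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        if node == b:
            return d
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return None  # no path

def path_similarity(adj, a, b):
    """Similarity = 1 / (1 + shortest path length), a simple path-based measure."""
    d = path_distance(adj, a, b)
    return None if d is None else 1.0 / (1.0 + d)

adj = to_undirected(HYPERNYMS)
print(path_similarity(adj, "dog", "cat"))  # dog-canine-carnivore-feline-cat: 0.2
```

The same idea underlies the path-length family of measures surveyed below; information-content and hybrid measures refine it with corpus statistics.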
In [18], the authors offer a broad survey of proximity metrics: path-length-based measures, information-content-based measures, feature-based measures, and hybrid measures. An information-theoretic approach to measuring the semantic similarity between WordNet concepts is developed in [19]. In [20], it was proposed to assign weights to the edges of WordNet and determine the proximity of words based on the "weighted WordNet".

A comprehensive review of Russian thesauri is presented in [21]. It includes, in particular, information about the RuThes and RuWordNet (Russian WordNet) thesauri. In [22, 23], these thesauri are used to solve the problem of establishing the semantic proximity of words. To the best of our knowledge, research that uses this approach to measure the degree of conceptual coherence of a text, and hence its complexity, is still scarce. In [6], DBpedia was used as a knowledge base, and the Newsela article corpus (https://newsela.com/data), containing original and artificially simplified texts, was used as the corpus. Entities expressed in a text by nominal groups were mapped onto knowledge-base concepts. All DBpedia concepts, together with the semantic relationships linking them to each other, form a graph; as a result, a text is mapped onto a subgraph of the complete DBpedia graph. The authors considered 13 parameters of the graph and calculated their values for original and simplified texts (200 texts in total). The research shows that all the parameters studied have a statistically significant relationship with text complexity (at least when the differences in text complexity are significant). Thus, this approach is viewed as reliable for assessing text complexity. In [24], the same authors propose a mechanism of spreading activation in a network of concepts, which may be used to model the effect of priming.
As priming is viewed as a mechanism accelerating text comprehension [25], it offers researchers another instrument for evaluating the conceptual complexity of texts.

In this paper, we propose a novel approach to assessing the conceptual complexity of texts. It differs from the approach proposed in [6] in several significant aspects: (1) it uses a WordNet-like thesaurus, rather than DBpedia, as the knowledge base; (2) it applies a set of structural features of the graph; (3) it tests the approach on natural texts of different conceptual complexity rather than on artificially simplified texts.

3. Materials and Methods

3.1. Datasets

For the experiments with English, we use two datasets, the Newsela corpus and the Simple English Wikipedia corpus. The Newsela corpus contains 1130 news articles; each article has 5 versions (1 original text and 4 simplified versions), so this dataset can be used in a multiclass classification task. The other corpus, Simple English Wikipedia, contains Wikipedia articles written primarily in Basic English; we use these data in a binary classification setting.

For the experiments with Russian texts, we use the Russian Readability Corpus (RRC). The RRC, compiled for the current research, comprises three sets of books: Social Science textbooks, History textbooks, and elementary school texts. Initially [4], the RRC was compiled from two sets of textbooks on Social Studies for secondary and high school, edited by Bogolyubov (BOG) and by Nikitin (NIK), containing 45,380 sentences from 14 textbooks. Later, a dataset of 17 elementary school texts (grades 1-4) and a dataset of 6 History textbooks (grades 10-11) were added.

3.2. Linguistic resources and knowledge bases

In the experiments with English texts, we applied two types of knowledge graphs, WordNet and Wikidata, while for Russian we use the RuThes-Lite thesaurus.
The RuThes thesaurus of the Russian language [26] is typically referred to as a linguistic knowledge base for natural language processing. The thesaurus provides a hierarchical network of concepts. Each concept has a name and is related to other concepts and to a set of language signs (words and phrases) whose meanings correspond to the concept. The conceptual relations in RuThes include the following:
• the class-subclass relation;
• the part-whole relation;
• the external ontological dependence, and others.
RuThes contains 54 thousand concepts, 158 thousand unique text entries (75 thousand single words), 178 thousand concept-text entry relations, and over 215 thousand conceptual relations. RuThes-Lite (http://www.labinform.ru/ruthes/index.htm) is the first publicly available version of RuThes; the process of generating RuThes-Lite from RuThes is described in [27]. For the goals of the present study, we have transformed RuThes-Lite into a graph whose vertices are concepts and whose edges are relations (we use the class-subclass and part-whole relations as well as the association relation).

3.3. A method for generating a graph from a text

As mentioned above, in this research we use knowledge graphs to estimate text complexity. The structure of a thesaurus is represented as a graph 𝐺0, with nodes derived from the concepts and edges derived from the relations. While processing a text (or a fragment of a text), words from the text are matched with concepts, i.e., with knowledge-graph nodes. Properly matching thesaurus concepts with raw text entries is an important step that usually involves disambiguation. However, the problem of automatic word-sense disambiguation for Russian has not been solved yet. Thus, when matching RuThes-Lite concepts with a text, we use simple string matching between the normalized words of the text and the text entries that correspond to the thesaurus concepts. We keep all the matching concepts in a temporary list.
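The matching step just described can be sketched as follows. The entry table, concept ids and the normalizer below are toy stand-ins invented for illustration; the real pipeline matches lemmatized Russian tokens against RuThes-Lite text entries.

```python
# Toy mapping from thesaurus text entries to concept ids (invented).
ENTRY_TO_CONCEPTS = {
    "tomato": {"C_TOMATO"},
    "vegetable": {"C_VEGETABLE"},
    "silver": {"C_SILVER", "C_SILVER_COLOR"},  # an ambiguous entry
}

def normalize(token):
    # Stand-in for lemmatization: lowercase and strip punctuation.
    return token.lower().strip(".,;:!?")

def match_concepts(text):
    """Collect every concept whose text entry occurs in the text.
    No disambiguation is performed, so an ambiguous entry contributes
    all of its concepts and the result may contain false positives."""
    matched = set()
    for token in text.split():
        matched |= ENTRY_TO_CONCEPTS.get(normalize(token), set())
    return matched

print(sorted(match_concepts("A tomato is a vegetable, not silver.")))
```

Note that both senses of "silver" end up in the temporary list; the filtering described next is what removes most such spurious matches.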
This process may produce many false positives in the temporary list, i.e., concepts that were not actually used in the text fragment. In the next step, i.e., while building a subgraph, we filter out all isolated nodes; it is during this procedure that the vast majority of false positives are excluded. In contrast, in the experiments with English texts, we make use of well-developed disambiguation techniques that map words to WordNet synsets.

The procedure for building a subgraph is straightforward: we use all matched concepts to produce a new graph 𝐺𝑆 as a subgraph of 𝐺0. If two nodes are connected in 𝐺0, then they are connected in 𝐺𝑆 too. A false positive will remain in the subgraph 𝐺𝑆 if and only if it has a connection with another falsely matched concept in the given text. Consequently, having two interconnected false positives is still possible, but having more than three interconnected false positives is a much rarer event. The limitations of the described procedure are obvious, as some of the isolated nodes may still contain valuable information for further analysis. For example, a sample text segment from a 6th-grade textbook is mapped to the corresponding subgraph of RuThes-Lite (Fig. 1); Figure 2 depicts a subgraph derived from a fragment of the Newsela corpus using WordNet.

Figure 1: A subgraph 𝐺𝑆 derived from RuThes-Lite and a text fragment. The subgraph contains only nodes and (hypernymy-hyponymy) edges between them (such edges are dark). Nodes without labels are kept in the figure as an illustration of related RuThes concepts that did not appear in the text fragment but are linked to nodes from the subgraph 𝐺𝑆.

3.4. Sampling and graph-based features

The RRC contains 37 documents and thus can hardly be viewed as a representative sample of the population of all school textbooks.
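The subgraph-extraction step of Section 3.3, inducing 𝐺𝑆 from 𝐺0 on the matched concepts and dropping isolated nodes, can be sketched as follows; the toy 𝐺0 and concept ids are invented, while the real graph comes from RuThes-Lite.

```python
def induce_subgraph(g0, matched):
    """Induce G_S from G_0 on the matched concepts, then drop isolated
    nodes: a matched concept survives only if it is connected to another
    matched concept, which removes most false-positive matches."""
    sub = {v: g0.get(v, set()) & matched for v in matched}
    return {v: nbrs for v, nbrs in sub.items() if nbrs}

# Toy G_0 as an undirected adjacency map (invented concept ids/relations).
G0 = {
    "C_TOMATO":    {"C_VEGETABLE"},
    "C_VEGETABLE": {"C_TOMATO", "C_FOOD"},
    "C_FOOD":      {"C_VEGETABLE"},
    "C_SILVER":    {"C_METAL"},
    "C_METAL":     {"C_SILVER"},
}
matched = {"C_TOMATO", "C_VEGETABLE", "C_SILVER"}  # C_SILVER: false positive
gs = induce_subgraph(G0, matched)
print(sorted(gs))  # ['C_TOMATO', 'C_VEGETABLE']
```

Here the falsely matched concept is isolated in 𝐺𝑆 (its only neighbor in 𝐺0 was not matched) and is therefore filtered out, exactly the behavior described above.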
However, for the purposes of text-complexity studies, we can split each document into multiple non-overlapping parts. If each part (or sample) is long enough, it can serve as a good representative of the whole document while preserving a certain variability. As we have no assumptions about what "long enough" means, we denote the size of each sample by the parameter S, measured in tokens. In the experiments, both the dependent variables (such as the readability value) and the independent variables were measured for a given sample. The sample size S was set to 200, 500, 1000 and 2000 tokens. During sampling, keeping the order of tokens and sentences is important; otherwise, the sampled texts would be less natural, even though they could still carry the main features of the documents in the corpus. Thus, we sample S-token sequences from each document (the last sentence is not truncated, hence the size of a sample is at least S tokens and at most S+k tokens, where k tokens are used to keep the last sentence in the sample).

Figure 2: A subgraph 𝐺𝑆 derived from WordNet and a text fragment.

We calculate all features for readability analysis using the described sampling technique. Using this technique, we can estimate the mean and range of the feature metrics with several samples taken from the RRC. Sampling from the RRC produces a set of text fragments (we use "text sample" and "text fragment" as synonyms). A subgraph 𝐺𝑆 is generated from each sample. Previously, in [28], we conducted experiments with RRC data to investigate the correlation between thesaurus-based features and complexity. Below, we describe the features that we use in experiments with text complexity and provide results from [28].
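The structural features listed below can be computed directly from the adjacency structure of 𝐺𝑆. A stdlib-only sketch, with a toy subgraph invented for illustration (the real 𝐺𝑆 comes from the thesaurus):

```python
from collections import deque

def bfs_dists(adj, src):
    """BFS distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def graph_features(adj):
    """NC, CS, MND, TCN and ASP for an undirected graph given as
    {node: set(neighbors)}, isolated nodes already removed."""
    comps, seen = [], set()
    for v in adj:
        if v not in seen:
            comp = set(bfs_dists(adj, v))
            seen |= comp
            comps.append(comp)
    # All nonzero shortest-path lengths (each unordered pair counted twice).
    paths = [d for v in adj for d in bfs_dists(adj, v).values() if d > 0]
    return {
        "NC": len(comps),                                # number of components
        "CS": sum(map(len, comps)) / len(comps),         # avg component size
        "MND": max(len(nbrs) for nbrs in adj.values()),  # max node degree
        "TCN": sum(1 for nbrs in adj.values() if nbrs),  # connected nodes
        "ASP": sum(paths) / len(paths),                  # avg shortest path
    }

# Toy G_S: a path a-b-c plus a separate edge d-e (invented).
g_s = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}}
print(graph_features(g_s))
# {'NC': 2, 'CS': 2.5, 'MND': 2, 'TCN': 5, 'ASP': 1.25}
```

Averaging shortest paths over all ordered pairs within components, as here, gives the same value as averaging over unordered pairs; the all-pairs BFS is quadratic but adequate for the subgraph sizes produced by a single text fragment.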
In our experiments, for each 𝐺𝑆 we calculate the following features:
• number of RuThes concepts (Co);
• number of components (NC);
• average component size (CS);
• maximum node degree (MND);
• total number of connected nodes (TCN);
• average shortest path (ASP).

Table 1: Pearson correlation between features based on RuThes-Lite and the grade level

S    | Co   | NC   | CS   | MND  | TCN  | TCN/Co | ASP
200  | 0.45 | 0.86 | 0.32 | 0.53 | 0.84 | 0.76   | 0.27
500  | 0.62 | 0.86 | 0.68 | 0.45 | 0.88 | 0.79   | -0.25
1000 | 0.56 | 0.38 | 0.73 | 0.40 | 0.85 | 0.69   | -0.36
2000 | 0.88 | 0.40 | 0.81 | 0.68 | 0.86 | 0.70   | -0.54

The correlation between the features based on RuThes-Lite and the grade level is presented in Table 1; in the original table, correlation coefficients higher than 0.7 are highlighted. For each row of Table 1, we measured 50 random samples per textbook from the RRC. In the next section, we present the results of experiments with English texts.

4. Experiments

The set of features that correlated with text complexity in the previous experiments (on Russian) was also tested on English. For the experiments on English texts, we chose a text classification task, so the set of graph features was tested on a different language. The experiments for English were carried out on the Newsela and Simple English Wikipedia corpora, with WordNet and Wikidata used as knowledge bases. Disambiguation methods developed for English enable high-quality text annotation with WordNet concepts, as do methods for linking Wikidata entities to text (entity linking). Experiments were carried out with classical machine learning methods (stochastic gradient descent, logistic regression, KNN, decision trees, and SVMs), as well as with neural network graph models and graph embeddings.

4.1.
Classification of texts by complexity based on the knowledge graph of WordNet

In the experiments with classical machine learning methods, feature vectors were constructed with the help of specialized methods. First, each node of the WordNet hierarchy gets its own vector representation, produced by the node2vec method [29]. The classifiers were trained on Simple English Wikipedia data and tested on a binary classification problem. The dimension of the embeddings for WordNet concepts was set to 128. The classification accuracy reached 93%; when two additional features were added (the average word length and the average sentence length), the accuracy increased to 95%. At the same time, a model trained exclusively on the "classical" features showed an accuracy of no more than 85%. Similar (in terms of accuracy) results were obtained in [30].

In the group of experiments with neural network graph models, feature vectors were also built using node2vec. The resulting vectors were assigned to the vertices of the graph G, after which a graph convolutional network (GCN, [31]) was applied to the graph. We also used GCNs to construct a convolution of the graph G and extract features directly. The classifier was trained on data from the Simple English Wikipedia corpus and tested on a binary classification task. The dimension of the embeddings for WordNet concepts equals 256. The classification accuracy on the test sample was 60-65%; changing the parameters did not improve the model quality.

Figure 3: Confusion matrix of the decision tree classifier on graph-based features (the Newsela dataset).

4.2. Classification of texts by complexity based on the Wikidata knowledge graph

To extract Wikidata concepts from text, we used the BLINK model [32]. Pretrained vectors of dimension 200 (PyTorch-BigGraph [33]) were used as vector representations for the nodes of the Wikidata graph. The classifier was trained on the Newsela dataset.
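The feature construction used with the classical classifiers in Section 4.1, i.e., the mean of pretrained node embeddings of the matched concepts concatenated with surface features, can be sketched as follows. The embedding values and synset ids here are toy stand-ins (the paper uses 128-dimensional node2vec vectors over the WordNet graph and an off-the-shelf classifier on top).

```python
# Toy stand-ins for pretrained node2vec-style node embeddings (invented values).
EMB = {
    "dog.n.01": [0.1, 0.4, -0.2, 0.0],
    "cat.n.01": [0.3, 0.0,  0.2, 0.4],
}

def text_features(synsets, words):
    """Represent a text as the mean embedding of its matched concepts,
    concatenated with a surface feature (average word length)."""
    dims = len(next(iter(EMB.values())))
    vecs = [EMB[s] for s in synsets if s in EMB]
    mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dims)]
    avg_word_len = sum(map(len, words)) / len(words)
    return mean + [avg_word_len]  # input vector for the classifier

feats = text_features(["dog.n.01", "cat.n.01"],
                      ["the", "dog", "sees", "a", "cat"])
print(feats)
```

Vectors of this shape, one per text sample, are what the classical models (logistic regression, decision trees, SVMs, etc.) are trained on.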
The maximum classification accuracy obtained was 62.5%; changing the parameters did not improve the model quality (Fig. 3).

General conclusions on the use of thesauri: classical machine learning models based on graph features demonstrated better results than graph convolutional networks [31]. Models based only on "superficial" features (such as average word length, average sentence length, FOG, SMOG, and others derived from the TextStat library, https://pypi.org/project/textstat/) can be improved by adding vector representations trained on the WordNet graph. However, the use of vectors built on the Wikidata graph requires further research to analyse the sources of the very moderate performance.

5. Conclusion

Text complexity is of utmost importance both for textbook authors and for students looking for educational materials. Modern methods and approaches of artificial intelligence, including knowledge bases, allow assessing the conceptual complexity of texts, thus providing educators and students with the instruments they need. The combined application of methods of computational linguistics and artificial intelligence can be successfully used for determining text complexity and can thus contribute to significant progress in understanding the notion of text complexity at a deeper conceptual level.

In this paper, we presented and evaluated graph-based complexity features. Such features can be extracted from text fragments using the hierarchical structure of a thesaurus. A previous study was conducted on the material of Russian texts and the RuThes-Lite thesaurus only; in it, we evaluated the correlation of the features with text complexity. The present work deals with texts in English. It is expected that similar results can be achieved for other languages with the same methods, since, at the conceptual level, languages reflect the real world using the same cognitive mechanisms. Switching to another language requires changing the corresponding linguistic ontology.
Therefore, we applied a similar approach using WordNet. Although the results are promising, many research questions in the area remain open. A perspective of the study lies in a detailed analysis of the use of knowledge graphs such as Wikidata and a comparison with the effectiveness of results derived with WordNet. While the current study shows somewhat mediocre performance of graph convolutional networks, an increased number of features and their automatic selection could be very fruitful for detecting relevant features of text complexity. In a number of aspects, our work can be treated as a benchmark applicable in further studies of the conceptual complexity of texts.

Acknowledgments

The study presented in this paper has been supported by the Kazan Federal University Strategic Academic Leadership Program (Sections 1-3 of the paper) and by the Russian Science Foundation, grant 18-18-00436 (Sections 4-5 of the paper).

References

[1] M. Solnyshkina, A. Kiselnikov, Slozhnost' teksta: etapy izucheniya v otechestvennom prikladnom yazykoznanii [Text complexity: study phases in Russian linguistics], Vestnik Tomskogo gosudarstvennogo universiteta. Filologiya [Tomsk State University Journal of Philology] (2015).
[2] V. Ivanov, M. Solnyshkina, V. Solovyev, Efficiency of text readability features in Russian academic texts, Komp'juternaja Lingvistika i Intellektual'nye Tehnologii 17 (2018) 277-287.
[3] R. Reynolds, Insights from Russian second language readability classification: complexity-dependent training requirements, and feature evaluation of multiple categories, in: Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications, 2016, pp. 289-300.
[4] V. Solovyev, V. Ivanov, M. Solnyshkina, Assessment of reading difficulty levels in Russian academic texts: Approaches and metrics, Journal of Intelligent & Fuzzy Systems 34 (2018) 3049-3058.
[5] B. Biryukov, B. Tyukhtin, O ponyatii slozhnosti [About the concept of complexity], in: Logika i metodologiya nauki. Materialy IV Vsesoyuznogo simpoziuma (1967) 219-231.
[6] S. Štajner, I. Hulpus, Automatic assessment of conceptual text complexity using knowledge graphs, in: Proceedings of the 27th International Conference on Computational Linguistics, Association for Computational Linguistics, 2018, pp. 318-330. URL: http://aclweb.org/anthology/C18-1027.
[7] C. A. Denton, M. Enos, M. J. York, D. J. Francis, M. A. Barnes, P. A. Kulesz, J. M. Fletcher, S. Carter, Text-processing differences in adolescent adequate and poor comprehenders reading accessible and challenging narrative and informational text, Reading Research Quarterly 50 (2015) 393-416.
[8] D. S. McNamara, A. Graesser, M. M. Louwerse, Sources of text difficulty: Across genres and grades, Measuring up: Advances in how we assess reading ability (2012) 89-116.
[9] Y. A. Tomina, Ob'ektivnaya otsenka yazykovoy trudnosti tekstov (opisanie, povestvovanie, rassuzhdenie, dokazatel'stvo) [An objective assessment of language difficulty of texts (description, narration, reasoning, proof)], Abstract of Pedagogy Cand. Diss., Moscow (1985).
[10] A. Laposhina, Relevant features selection for the automatic text complexity measurement for Russian as a foreign language, Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (2017) 1-7.
[11] D. McNamara, A. Graesser, P. McCarthy, Z. Cai, Automated evaluation of text and discourse with Coh-Metrix, Cambridge University Press, 2014.
[12] S. Crossley, D. McNamara, Text coherence and judgments of essay quality: Models of quality and coherence, in: Proceedings of the Annual Meeting of the Cognitive Science Society, volume 33, 2011.
[13] S. Thornbury, Beyond the Sentence: Introducing Discourse Analysis, ELT Journal 60 (2006) 392-394. URL: https://doi.org/10.1093/elt/ccl033. doi:10.1093/elt/ccl033.
[14] A. Collins, M. Quillian, Retrieval time from semantic memory, Journal of Verbal Learning and Verbal Behavior 8 (1969) 240-248.
[15] M. Steyvers, J. B. Tenenbaum, The large-scale structure of semantic networks: statistical analyses and a model for semantic growth, arXiv preprint cond-mat/0110012 (2001).
[16] C. Fellbaum (Ed.), WordNet: An Electronic Lexical Database, MIT Press, 1998.
[17] A. Budanitsky, G. Hirst, Evaluating WordNet-based measures of lexical semantic relatedness, Computational Linguistics 32 (2006) 13-47.
[18] L. Meng, R. Huang, J. Gu, A review of semantic similarity measures in WordNet, International Journal of Hybrid Information Technology 6 (2013) 1-12.
[19] T. Hong-Minh, D. Smith, Word similarity in WordNet, in: Modeling, Simulation and Optimization of Complex Processes, Springer, 2008, pp. 293-302.
[20] M. G. Ahsaee, M. Naghibzadeh, S. E. Y. Naeini, Semantic similarity assessment of words using weighted WordNet, International Journal of Machine Learning and Cybernetics 5 (2014) 479-490.
[21] N. S. Lagutina, K. V. Lagutina, A. S. Adrianov, I. V. Paramonov, Russian language thesauri: automated construction and application for natural language processing tasks, Modelirovanie i Analiz Informatsionnykh Sistem 25 (2018) 435-458.
[22] N. Loukachevitch, A. Alekseev, Summarizing news clusters on the basis of thematic chains, in: Ninth International Conference on Language Resources and Evaluation (LREC 2014), 2014, pp. 1600-1607.
[23] D. Ustalov, Concept discovery from synonymy graphs, Vychislitel'nye tekhnologii [Computational Technologies] 22 (2017) 99-112.
[24] I. Hulpus, S. Štajner, H. Stuckenschmidt, A spreading activation framework for tracking conceptual complexity of texts, in: Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019, pp. 3878-3887.
[25] T. Gulan, P. Valerjev, Semantic and related types of priming as a context in word recognition, Review of Psychology 17 (2010) 53-58.
[26] N. Loukachevitch, B. Dobrov, RuThes linguistic ontology vs. Russian wordnets, in: Proceedings of the Seventh Global WordNet Conference, 2014, pp. 154-162.
[27] N. Loukachevitch, B. Dobrov, I. Chetviorkin, RuThes-Lite, a publicly available version of thesaurus of Russian language RuThes, in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue", 2014.
[28] V. Solovyev, V. Ivanov, M. Solnyshkina, Thesaurus-based methods for assessment of text complexity in Russian, in: Mexican International Conference on Artificial Intelligence, Springer, 2020, pp. 152-166.
[29] A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855-864.
[30] Z. Jiang, Q. Gu, Y. Yin, D. Chen, Enriching word embeddings with domain knowledge for readability assessment, in: Proceedings of the 27th International Conference on Computational Linguistics, 2018, pp. 366-378.
[31] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint arXiv:1609.02907 (2016).
[32] L. Wu, F. Petroni, M. Josifoski, S. Riedel, L. Zettlemoyer, Scalable zero-shot entity linking with dense entity retrieval, arXiv preprint arXiv:1911.03814 (2019).
[33] A. Lerer, L. Wu, J. Shen, T. Lacroix, L. Wehrstedt, A. Bose, A. Peysakhovich, PyTorch-BigGraph: A large-scale graph embedding system, arXiv preprint arXiv:1903.12287 (2019).