International Conference "Internet and Modern Society" (IMS-2020). CEUR Proceedings

Russian Literature Around the October Revolution: A Quantitative Exploratory Study of Literary Themes and Narrative Structure in Russian Short Stories of 1900–1930

Tatiana Sherstinova1,2 [0000-0002-9085-3378] and Tatiana Skrebtsova2 [0000-0002-7825-1120]

1 National Research University Higher School of Economics, 123 Griboyedova Canal Emb., St Petersburg 190068, Russia
2 St. Petersburg State University, 7/9 Universitetskaya emb., 199034 St. Petersburg, Russia
tsherstinova@hse.ru; t.skrebtsova@spbu.ru

Abstract. The paper explores the thematic content and plot structure of Russian short stories written in the first three decades of the 20th century. It presents part of an ongoing project aimed at a comprehensive study of the Russian short stories of this period, encompassing their thematic, structural and linguistic features. This particular period is targeted because it was marked by a series of dramatic historical events (Russo-Japanese war, World War I, February and October revolutions, the Civil War, formation of the Soviet Union) that could not but affect Russian literature and language style. Within the project, a corresponding text corpus has been created, currently containing several thousand stories and thus allowing for a wide coverage of texts and their computer processing. On its basis, a random sample has been selected, serving as a testbed to probe preliminary observations and hypotheses. It is used in the paper to identify prevailing themes, both major and minor, manifest and latent, as well as characteristic narrative structures, and to trace the way they changed over the three decades. This helps to pinpoint certain features and tendencies which may be of interest to literary theorists and other scholars.
Keywords: digital humanities, Russian literature, Russian short stories, literary themes, revolution, social changes, literary history, narrative structure, literary corpus.

Introduction

In this paper, we present recent results obtained within the ongoing project “The Russian language on the edge of radical historical changes: the study of language and style in pre-revolutionary, revolutionary and post-revolutionary artistic prose by the methods of mathematical and computer linguistics (a corpus-based research on Russian short stories)” [1; 2]. The project’s overall goal is to give a comprehensive account of early 20th century Russian short stories from the thematic, structural and linguistic perspectives [3].

Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Computational Linguistics

This particular period is targeted because it was marked by a series of dramatic historical events (Russo-Japanese war, World War I, February and October revolutions, the Civil War, formation of the Soviet Union) that could not but affect Russian literature and language style. In particular, the October Revolution of 1917 is known to be one of the key topics of Russian literature of the 20th century [4]. However, literary scholars have usually approached this topic from a purely qualitative viewpoint [5; 6; 7]. In our research, we set out to obtain a preliminary quantitative assessment of literary changes in 1900–1930 in terms of theme distributions and modifications of narrative structure by comparing different chronological periods [8].

To accomplish this, a text corpus was created, containing several thousand short stories written in Russia and, later, the Soviet Union, and published in the timespan from 1900 to 1930 in literary journals or story books.
This timespan is divided into 3 parts, 1900–1913, 1914–1922 and 1923–1930, the first covering the time before the great cataclysms, the second embracing World War I, February and October revolutions and the Civil War, and the third accounting for the post-war socialist period. Each author may be represented by a single, randomly selected, story per period. To ensure robustness of the results, the corpus aims to take account of as many professional writers as possible, both famous (e.g. Anton Chekhov, Leo Tolstoy, Ivan Bunin, Maxim Gorky) and lesser-known ones, metropolitan and provincial alike [3].

From this corpus, a random sample was taken, containing 310 stories by 300 authors (some writers feature in more than one period, which accounts for the slight discrepancy in numbers) [ibid.]. This sample serves as an initial testbed for linguists and literary scholars, enabling them to put forward and prove (or disprove) preliminary conceptions concerning the Russian short stories of the early 20th century as a special genre, with its specific themes, plot structure and stylistic features.

1 Thematic Tagging

1.1 General Approach

Identifying themes in works of literature is a rather difficult and controversial issue [9; 10]. First and foremost, the problem is that literary texts are often heavily laden with implicit meanings, as opposed, say, to academic or mass media discourse [11]. Thus there are no common statistical or computational techniques to be used for such a goal. Instead, a careful qualitative analysis and interpretation are needed, at least initially. Automatic theme extraction procedures [12; 13] could be considered or devised later, once there is a certain amount of data at hand, but it would still be futile to rely on them fully. The thematic tagging which we are going to discuss here was done manually.
Another difficulty concerning the thematic content of a literary text is that it normally contains a handful of themes, like love, war, death and desolation, or, say, art, poverty, and suicide. In fiction, unlike other text types, themes are not hierarchically arranged, so that one cannot definitely tag one of them as dominant, or global, and others as subordinate, or local.

In theory, they can all be brought together in a single proposition, as suggested by Teun van Dijk [14: 134ff] for discourse topics in general, e.g. “A poverty-stricken artist desperately needs money and, unable to sell his paintings, commits suicide”. Obviously enough, each story will then have an individual topic and there will be little chance of detecting regularities.

We take a different approach. The basic idea is that thematic tagging of short stories presupposes the identification of all semantic components that contribute to the plot, determine the protagonist’s motives and actions and directly bear on the conflict and its resolution. Each story is thus provided with a set of themes, similarly to the way componential analysis presents word meaning as a bundle of semantic features. The difference, though, is that while componential analysis aims to bring out the complete semantic content of a word, the set of themes does not fully define the short story plot.

We proceeded as follows. A rough set of themes was drawn from the first-period stories. It was subsequently tested against the short stories of the two other periods, with inevitable corrections, deletions and additions. The final set for the whole sample currently numbers 89 themes, ranging from political to personal, and from philosophical to mundane.

In the next section, we briefly touch upon some of them and comment on the frequency rates over the three periods.
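The representation described above (each story carrying an unordered set of theme tags, with rates then compared across periods) can be illustrated with a minimal Python sketch. It is not the procedure used in the project, where tagging was done manually; the records below are hypothetical toy data, and the function name `theme_rates` is an assumption introduced only for illustration.

```python
from collections import Counter

# Hypothetical toy records: (period, set of theme tags).
# These are NOT data from the study; they only illustrate the structure.
stories = [
    ("1900-1913", {"romantic love", "poverty"}),
    ("1900-1913", {"poverty", "natural death"}),
    ("1914-1922", {"World War I", "death in the war"}),
    ("1923-1930", {"Civil War", "new ways of life"}),
    ("1923-1930", {"Civil War", "peasant life"}),
]


def theme_rates(records):
    """Return {period: {theme: share of that period's stories tagged with it}}."""
    totals = Counter()   # number of stories per period
    counts = {}          # per-period Counter of theme occurrences
    for period, themes in records:
        totals[period] += 1
        counts.setdefault(period, Counter()).update(themes)
    return {
        period: {theme: n / totals[period] for theme, n in theme_counts.items()}
        for period, theme_counts in counts.items()
    }


rates = theme_rates(stories)
for period in sorted(rates):
    print(period, rates[period])
```

Because themes are stored as a set per story, no theme is marked as dominant, which mirrors the non-hierarchical treatment described above. Normalising by the number of stories in each period keeps the rates comparable even if the periods are unevenly represented in a sample.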
It is important to note that themes pertaining to the socio-political agenda are likely to be evoked in fiction long after the events concerned. Thus, the Civil War is a theme in twice as many stories of the third period as of the second one. The greater the event, the stronger the postponed effect. One should be aware of this when comparing the figures.

1.2 Theme Rates over Three Periods

The initial three decades of the 20th century proved a difficult time in Russian history. Defeat in the Russo-Japanese war (1904–1905), the subsequent political and social unrest, World War I, the February and October revolutions of 1917, resulting in a radical transformation of economic, political and social life, and finally the Civil War (1917–1922) with its aftermath could not fail to affect Russian literature. It is only natural that these events are used in many stories as settings.

We treat such political events as themes if they play a key role in the plot. This is often the case with the war themes. With the revolutions, however, things are different. In a sense, almost all stories of the third period and some of the second could be marked by this tag, since their contents would be deemed unrealistic had the revolutions not taken place. Nevertheless, we think it completely unnecessary to introduce a February revolution theme. As for the October revolution theme, only a couple of stories, specifically highlighting the role of this event for the plot, are tagged with it (Fig. 1).

Another thematic block closely associated with the sociopolitical context comprises issues dealing with the country’s development policy adopted after the October revolution, such as technical progress, mass education, women’s emancipation, explorations and inventions. They became particularly relevant after the end of the Civil War, during the third period (1923–1930) (Fig. 2).

Fig. 1.
1 – Russo-Japanese war, 2 – World War I, 3 – October revolution, 4 – Civil War

Fig. 2. 1 – Technical progress, 2 – Mass education, 3 – Women’s emancipation, 4 – Explorations and inventions

The process of instituting a new social order is a key theme in many stories of the third period. Sometimes the new order is explicitly set against the old one, with the former always evaluated positively and the latter, negatively. Such a neat divide is due to the fact that people disapproving of the October revolution and the subsequent transformations either had to leave the country or keep silent. It was impossible for authors denying new ideas and values to get their work published (Fig. 3).

It was perhaps the peasant life that underwent the greatest transformations at this time. The October revolution totally eliminated the familiar pre-revolutionary pattern of the well-to-do families dwelling in large cities during the winter and moving to their countryside estates in the summer (where they naturally may have met peasants, but such encounters did not normally constitute a story theme). Instead, the protagonists of the third-period stories either reside in the city (workers, clerks) or, most often, are to be found in the rural settlements. There they are trying to survive in the absence of food, cattle, seeds, agricultural implements, horses or any other facilities.

Fig. 3. 1 – New ways of life, 2 – The old and the new

Besides, there is a split between the poorer farmers who supported the revolution and sided with the Red Army during the Civil War and the wealthier ones, who do their best to retain the traditional lifestyle. The two groups fight over land and the new ways of things in general, sometimes with violence (Fig. 4).

Fig. 4. 1 – Rural life vs.
city life, 2 – Peasant life, 3 – Violence, 4 – Murder

A major change can be seen in the relative frequency of such themes as Christian God (incorporating the concepts of faith, saints, sin and even the devil) and religion as a social institution across the second and third periods. During the “military” period from 1914 to 1922, the concept of God, quite naturally, was among the key ones. After the ultimate victory of the Red Army, peacetime ensued, marked by an active anti-religious policy of the Soviet government. Spiritual issues are seldom (if ever) mentioned in the literature of the third period. This is not the case, however, with religion as a social institution. Although from the quantitative viewpoint the third period looks exactly like the second one, the situation is different in two respects. First, in the third-period stories, the Christian church no longer enjoys a monopoly and has to make room for the Jewish and Buddhist religions. Second, the references to the church, priests, worshippers, etc. are outright derogatory or, at best, ironic (Fig. 5).

Fig. 5. 1 – Christian God, 2 – Religion as a social institution

One might think that there are timeless, core values in human life, unlikely to be affected by political whirlpools and transformations of social life. This may well be so as regards individual lives, but in the literature of tumultuous periods the focus is shifted towards large-scale public events. As a result, strictly personal topics like marriage, romantic love, unfaithfulness, jealousy, children and parental love gradually decline, becoming less prominent and frequent (Fig. 6).

Fig. 6. 1 – Romantic love, 2 – Marriage, 3 – Unfaithfulness, 4 – Jealousy, 5 – Children, 6 – Parental love

Interestingly, the sexual aspect of love and, more broadly, the body life is on the rise over the three periods (Fig. 7).

Poverty, hunger and lack of money plagued people’s lives more or less steadily.
During the war the hardships obviously increased. They did not diminish after the end of the Civil War, as the country was exhausted and near ruin. The economy was devastated, and people were starving and dying from epidemics and lack of health care. The number of stories highlighting the contrast between the rich and the poor and the crucial role of money went down slightly after the revolution, as there were no more wealthy people (Fig. 8).

Fig. 7. 1 – Mutual sexual love, 2 – Body life

Fig. 8. 1 – Poverty, hardships, 2 – The rich and the poor, money

Fig. 9. 1 – Death in the war, 2 – Natural death, 3 – Monotonous everyday life, boredom

The difference between the times of war and peace is most obviously reflected in the figures related to the death-in-the-war theme. An increase in this theme runs parallel to the decrease in the number of stories involving death from natural causes. Surprisingly, there is yet another thematic marker of peace times, and that is boredom. In the epoch of cataclysms, people do not have the luxury of monotonous everyday life (Fig. 9).

2 Narrative Structure

2.1 Conflict and Resolution

It is commonly believed that works of fiction, in particular short stories, are bound to have a standard plot structure consisting of 5 parts: exposition, rising action, conflict, falling action, resolution [15]. Complications signaled at the beginning tend to increase and reach a climax, a turning point after which the main conflict unravels and is finally resolved [ibid.]. Curiously enough, this classical framework is rather often breached in short stories of all three periods [16].

The non-canonical cases can be roughly divided into two groups. One contains stories with little or no action, intentionally devoid of changes in the protagonist’s fate.
The other embraces stories filled with small-scale events and local conflicts which, however, do not translate into a conclusive climax bringing about a new state of affairs in the protagonist’s life. Most often, this is done on purpose, but in some cases the deficient structure may result from the author’s poor writing skills. The two groups together account for about 30% of the stories in the first and second periods; in the third period the share drops to roughly 25%.

Fig. 10. Features of the composition: 1 – No climax, 2 – A number of small climaxes

The short stories marked by a non-standard composition cannot be safely linked to particular themes. For example, quite often stories about poverty and hardships have no conflict and thus no resolution. This helps to highlight the protagonist’s hopeless position. If there were a conflict, it would be followed by a resolution bringing important change in the protagonist’s life, which would run counter to the author’s intentions. The same applies to such themes as monotonous everyday life, boredom and hard work. But such correlations are by no means a rule.

The deficient structure is regularly found in short stories involving thoughts, reminiscences, dreams, fantasies, mysticism, and the supernatural. The whimsical temporal structure and general lack of coherence characteristic of these phenomena are reflected in the narrative, preventing a progressive unraveling of the plot.

Many stories dealing with the new social order established after the October revolution have no obvious conflict or conclusive resolution, either. The writers simply depicted the new order because it was novel and unusual, sometimes opposing it to the old way of things.

What may seem strange at first sight is that quite a number of stories about political events also lack the canonical narrative structure.
This is usually done on purpose to underline ineffective leadership, hesitation, stalemate, overall confusion or futile individual efforts and despair. Such literary stories about hopeless, non-heroic situations actually exhibit effects similar to those of the everyday stories about ethnic minorities, which were shown to lack resolution more often than not [17].

Another conspicuous factor accounting for a loose narrative structure without a salient conflict and resolution, especially as regards the short stories of the second and the third periods, is quite trivial. The October revolution and the subsequent radical transformations resulted in the emigration of many talented Russian writers, the vacancies being filled by lesser-known or inexperienced young authors whose professional competence or talent left much to be desired. Short stories by Dmitri Furmanov and Zinaida Richter, which mix fiction with documentary writing, are glaring examples of this sort. Also indicative of the tendency are numerous third-period stories pervaded by ideological evaluations which bring them close to newspaper articles of that time.

Finally, it might be presumed that the strength of the short stories’ conflict and resolution is partly determined by the periods of the national literature. Thus, it was shown in [18] that American 19th-century stories tend toward greater resolution on the level of the plot than those of the 20th century. Closural states referred to in the terminal sentences of the 19th-century stories deal mostly with objective events (death, parting, marriage, an obstacle removed, a problem solved, a goal achieved), while in the 20th century there is a noticeable shift toward subjective and minor things like satisfaction [ibid.]. Naturally, literature periodization is not the same for different national traditions; still, the overall trend seems clear enough.
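The share of non-canonical compositions per period, reported earlier in this section (about 30% in the first two periods and roughly 25% in the third), is a simple proportion. The Python sketch below shows one way such a figure could be computed; it is only an illustration under stated assumptions. The label names `no_climax` and `small_climaxes` are hypothetical encodings of the two groups shown in Fig. 10, and the records are toy data, not the study's sample.

```python
from collections import defaultdict

# Hypothetical label names for the two non-canonical groups (cf. Fig. 10).
NON_CANONICAL = {"no_climax", "small_climaxes"}

# Hypothetical toy records: (period, composition label). Not data from the study.
stories = [
    ("1900-1913", "canonical"),
    ("1900-1913", "no_climax"),
    ("1914-1922", "small_climaxes"),
    ("1914-1922", "canonical"),
    ("1923-1930", "canonical"),
    ("1923-1930", "canonical"),
    ("1923-1930", "canonical"),
    ("1923-1930", "no_climax"),
]


def non_canonical_share(records):
    """Return {period: fraction of stories with a non-canonical composition}."""
    totals = defaultdict(int)    # stories per period
    flagged = defaultdict(int)   # non-canonical stories per period
    for period, label in records:
        totals[period] += 1
        if label in NON_CANONICAL:
            flagged[period] += 1
    return {period: flagged[period] / totals[period] for period in totals}


shares = non_canonical_share(stories)
for period in sorted(shares):
    print(f"{period}: {shares[period]:.0%}")
```

Counting both groups together follows the way the total is reported in the text; keeping the two labels distinct in the data would still allow them to be separated, as in Fig. 10.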
2.2 Narrative Modes

Traditionally, third-person narration is the most commonly used narrative mode in literature. The first-person point of view is rather frequent in short stories, too. This holds for our sample. However, a few interesting details are worth mentioning (Fig. 11).

Fig. 11. Narrative modes: 1 – 1st person narration, 2 – 3rd person narration, 3 – Alternating-person narration, 4 – Embedded story

To begin with, the ratio of first-person narration to third-person narration is not the same over the three periods. In the stories written from 1923 to 1930, there is an increase of roughly 5% in the former. The narrator is thus placed close to the reader and the unfolding story, making the latter seem more personal and subjective.

The sense of subjectivity is felt even more strongly in alternating-person narration, which was constantly on the rise, going from 3 in the pre-war stories to 5 in the war period and then up to 12 in the post-war period. Why such an increase?

In the 1923–1930 period, with communist control of the country well assured, there arose a need to promote the alleged advantages of the new order. In many stories an ideological component was made explicit by the narrator’s first-person comment, usually placed at the end of the story and separated from the body text by asterisks or even marked as “Afterword” (e.g. stories by Sivachov and Zorich). In such cases, the reader initially takes the story to be third-person narration and, all of a sudden, comes across a first-person evaluation of the plot at the end. Such a structure is not found in the stories of the other two periods.

Thus, what is most peculiar about the increased number of stories involving alternating narration is not the numbers as such but rather the purpose. While this narration type is generally used mostly to impart a personal note, in the socialist period it often served to introduce ideology.
It may be said that the relatively high percentage of stories involving alternating narration is due to the need (perceived by the writers) to express an explicit evaluation of the new order. This is yet another aspect which makes it possible to draw a parallel between the literary prose of the 1923–1930 period and everyday stories (see also above).

A classical way to combine different points of view in narration is embedded narrative, or a story within a story. The number of such cases is more or less stable across all three periods. As a rule, it is the embedded story that has a canonical structure while the frame story lacks conflict and resolution. The only exception found in the sample is Vladimir Korolenko’s Frost, which has a full-fledged composition in both frame and embedded stories. Leonid Leonov’s Tramp is another interesting case in point, as it has two successively embedded stories.

Conclusion

In this paper we have touched upon two of the three aspects defining a genre, to wit, themes and composition. Although the linguistic aspect has been deliberately left out, the overall picture is clear.

The short stories published in the third period are quite different from those of the first period in both thematic and structural aspects. New themes emerged while some of the old ones dropped in frequency or radically changed in evaluation (e.g. religion as a social institution). The latter in particular illustrates the need for qualitative rather than purely statistical analysis. Some stories of the third period exhibit a quite special structure, marked by the narrator’s explicit comment on the ideological gist of the plot. Such an unusual component, untypical of fiction in general, was prompted by external factors discussed above.
It is totally absent from the stories of the previous periods and, it might well be assumed, will be seldom, if ever, found in more recent literature.

The second-period stories cannot be viewed as a “bridge” between the literature of the two peace periods. They have a distinct character of their own, shaped by the large-scale political and military events. As concerns the composition, however, these short stories pick up and continue the traditions of classical Russian literature and as such are closer to the first-period ones. Due to inertia, this is true for the post-revolution years as well, including the Civil War. Thus, the above-mentioned postponed effect holds not only for the stories’ thematic content but also for their structure.

The quantitative data obtained should be judged as preliminary, since we have examined only a small portion of the Russian literature of the designated period. The proposed methodology seems promising for the analysis of literary corpora in general, the number of which is constantly on the rise in digital humanities research [20; 21].

Acknowledgement. The research is supported by the Russian Foundation for Basic Research, project # 17-29-09173 “The Russian language on the edge of radical historical changes: the study of language and style in prerevolutionary, revolutionary and post-revolutionary artistic prose by the methods of mathematical and computer linguistics (a corpus-based research on Russian short stories)”.

References

1. Martynenko, G.Ya., Sherstinova, T.Yu., Melnik, A.G., Popova, T.I.: Methodological problems of creating a Computer Anthology of the Russian story as a language resource for the study of the language and style of Russian artistic prose in the era of revolutionary changes (first third of the 20th century). In: Computational linguistics and computational ontologies. Issue 2.
Proceedings of the XXI International United Conference ʻThe Internet and Modern Societyʼ, IMS-2018, St. Petersburg, May 30 - June 2, 2018. Collection of scientific articles, ITMO University, St. Petersburg. Pp. 99–104 (2018). [In Russian].
2. Martynenko, G.Ya., Sherstinova, T.Yu., Popova, T.I., Melnik, A.G., Zamirajlova, E.V.: On the principles of creation of the Russian short stories corpus of the first third of the 20th century. Proceedings of the XV International Conference on Computer and Cognitive Linguistics ʻTEL 2018ʼ, Kazan. Pp. 180–197 (2018). [In Russian].
3. Martynenko, G., Sherstinova, T.: Linguistic and Stylistic Parameters for the Study of Literary Language in the Corpus of Russian Short Stories of the First Third of the 20th Century. In: R. Piotrowski's Readings in Language Engineering and Applied Linguistics-2019, St. Petersburg (in print) (2020).
4. Brown, E. J.: Russian Literature Since the Revolution: Revised and Enlarged Edition, Harvard University Press (1982).
5. Trotsky, L. D.: Literature and Revolution, University of Michigan Press (1960).
6. Erlich, V.: Modernism and Revolution: Russian Literature in Transition, Harvard University Press (1994).
7. Romodanovskaya, E. (ed.): Story-motive complexes of Russian literature. Novosibirsk, Geo (2012). [In Russian].
8. Martynenko, G., Sherstinova, T.: Emotional Waves of a Plot in Literary Texts: New Approaches for Investigation of the Dynamics in Digital Culture. In: Alexandrov D., Boukhanovsky A., Chugunov A., Kabanov Y., Koltsova O. (eds) Digital Transformation and Global Society. DTGS 2018. Communications in Computer and Information Science, vol 859. Springer, Cham. https://doi.org/10.1007/978-3-030-02846-6_24, Pp. 299–309 (2018).
9. Seigneuret, J.C.: Dictionary of Literary Themes and Motifs. Vol. 1–2, Greenwood (1988).
10. Beardsley, M.: Theme and Form: an Introduction to Literature. Prentice-Hall, Inc./A Simon & Schuster Company (1956).
11.
Blummer, B., Kenton, J.M.: Academic Libraries’ Outreach Efforts: Identifying Themes in the Literature, Public Services Quarterly, Volume 15, Issue 3, pp. 179–204 (2019).
12. Blei, D. M., Lafferty, J. D.: Dynamic topic models. In Proc. 23rd International Conference on Machine Learning, pp. 113–120 (2006).
13. Wang, Q., Cao, Z., Xu, J., Li, H.: Group matrix factorization for scalable topic modeling. In Proc. 35th SIGIR Conf. on Research and Development in Information Retrieval, pp. 375–384 (2012).
14. van Dijk, T.: Text and Context: Explorations in the Semantics and Pragmatics of Discourse. Longman, London and New York (1977).
15. Murfin, R., Ray, S. M.: The Bedford Glossary of Critical and Literary Terms. Third Edition. Bedford/St. Martin's (2009).
16. Skrebtsova, T. G.: Narrative structure of the Russian short story in the early XX century. Proceedings of the International Conference ʻCorpus Linguistics-2019ʼ. St. Petersburg: Publishing House of St. Petersburg University. Pp. 426–431 (2019). [In Russian].
17. van Dijk, T.: Prejudice in Discourse: An Analysis of Ethnic Prejudice in Cognition and Conversation. John Benjamins, Amsterdam and Philadelphia (1984).
18. Lohafer, S.: A cognitive approach to storyness. In: May, Ch. (ed.) The new short story theories, pp. 301–311. Ohio University Press, Athens (1994).
19. Carter, B. (ed.): Digital Humanities: Current Perspective, Practices, and Research. Emerald Publishing Limited (2013).
20. Warwick, C., Terras, M. M., Nyhan, J.: Digital Humanities in Practice. Facet Publishing in association with UCL Centre for Digital Humanities, London (2012).