Language Engineering Methods in Forensics: a Study of the Authorship of Texts Attributed to Lavrenty Beria ∗

Mikhail Marusenko (1,2), mamikhail@yandex.ru
Vadim Petrov (1), vadim.petrov1953@mail.ru
Xenia Piotrowska (2), krp62@mail.ru
Sergey Bogdanov (2), rector@herzen.spb.ru

(1) Saint-Petersburg State University, (2) Herzen State Pedagogical University of Russia, St. Petersburg, Russian Federation

∗ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

This paper discusses the need for, and the possibility of, applying quantitative linguistics methods to forensic investigation. The results of the examination of the texts appearing in the Beria case (“Beria’s letters from prison”, interrogation protocols, the last words of the defendants) testify to the falsification of the trial and of the entire process of removing Beria from power.

Keywords: forensic science, quantitative linguistics, historical expert examination, Beria case

1 Introduction

More than 66 years have passed since the day when Lavrenty Beria was removed from the Soviet political Olympus. Yet there is no unanimity among scholars even as to when Beria was actually killed: immediately after the arrest, in early July 1953, or, according to the official date of his execution, at the end of December of that year. From July 1953 until the mid-1990s, the name of L. Beria was steadily associated with the words "spy", "vile killer", "foreign intelligence agent", "sexual maniac", "rapist" and the like. Thanks to N. Khrushchev, Beria became the main monster of the Stalin era. Only recently have historians begun to re-evaluate their attitude to him [Chertinov, 2018a, 20].

Regarding the crimes attributed to Beria, A. Dugin and V.
Shepelev note that in the materials they studied, including the recently published draft indictment in the Beria case, there was not a single serious crime, confirmed by factual evidence, that could be qualified as treason or espionage in favor of foreign states, nor was there one in the minutes of the Plenum of the Central Committee of the CPSU, which took place in July 1953 [Dugin, Shepelev, 2015]. Under N. Khrushchev, the name of L. Beria was not merely hushed up: after his death, Beria was deleted from Soviet history, and his name was removed from books and even from encyclopedias [Prudnikova, 2014]. In addition, Khrushchev and his associates completely ignored anything good Beria had done to strengthen the power and defense capability of the Soviet Union. This created a somewhat paradoxical situation, which many researchers do not notice or do not want to notice: only L. Beria remains convicted of crimes, many of which he either did not commit or did not commit alone [Kudriashov, 2017].

After the pre-war purges and the difficult war and post-war years, the population viewed the security services negatively. The conspirators against Beria skillfully took advantage of this and, among other circumstances, of the internal political situation that had developed in the USSR after the death of J. Stalin. All the conspirators had to do was convince the delegates of the Plenum of the Party Central Committee and the members of the Supreme Soviet of the treachery and treason of L. Beria. Only after 1991 did the ban on the name of Lavrenty Beria begin to be lifted, and “now there is no longer an unambiguous assessment of the activities of Stalin’s associate. Nevertheless, an official, objective assessment of this contradictory personality of the twentieth century has not yet been given” [Kudriashov, 2017].

2 Forensic aspect of the Beria case

A number of scholars believe that Beria was murdered on the afternoon of June 26, 1953, or shortly after that date. They point to General K. Moskalenko as the direct perpetrator of the murder [Mukhin, 2002][Pereyaslavov, 2002]. V. Chertinov also dates Beria’s death to June 26, 1953, but he points to a conspiracy against Joseph Stalin as the motive for the murder. According to the official version, the military, led by Marshal G. Zhukov, arrested Beria directly in the meeting room, and in December, after a closed trial, he was shot. According to the unofficial version, Beria did not even reach the meeting: he was killed that morning, when his Moscow house was stormed [Chertinov, 2018b].

N. Kudryashov, raising a number of entirely logical questions, concludes that after Stalin’s death Lavrenty Beria was doomed [Kudriashov, 2017][Dugin, Shepelev, 2015, 48]. The information provided by Kudryashov convincingly indicates that Khrushchev and his associates had very serious motives for killing Beria immediately, without entrusting his interrogation even to the most trusted people. At the July 1953 Plenum of the Central Committee of the CPSU, N. Bulganin named Malenkov, Khrushchev and Molotov as the organizers of the conspiracy against Beria. N. Khrushchev himself, in his brief reply, counted Bulganin among them as well [Beria, 1999, 258]. He remarked that Bulganin had made a slip of the tongue, implying that there had been no conspiracy against Beria, since he had acted alone [Beria, 1999, 255]. But Khrushchev himself had many reasons to eliminate Beria as soon as possible, arising from the circumstances of Khrushchev’s previous work [Sever, 2018, 33]. Some of the speeches at this Plenum transparently hint that Beria was already dead [Kudriashov, 2017, 201], yet none of the speakers at the July 1953 Plenum of the CPSU Central Committee could present clear evidence of the urgent need for Beria’s arrest [Tskvitariya, 2015, 350].
Other researchers, considering the presence of agents of foreign special services among senior officials of the USSR, have also drawn attention to the possible participation of Western intelligence services in the deaths of both Stalin and Beria [Dugin, Shepelev, 2015] [Yerashov, 2010, 15-16]. For example, if Bulganin had been an agent of the CIA or of British intelligence, he could have participated in the conspiracy against Beria not only as an ally of Khrushchev, but also in that capacity. Another, rather implicit reason for the liquidation of Beria could have been the primacy of pragmatism over ideology in Beria’s activities [Tskvitariya, 2015, 351].

Based on all the facts cited above, it can be concluded that the accounts of the circumstances of Beria’s death are highly contradictory. L. Beria was most likely killed by the conspirators either on June 26, 1953, or somewhat later, and did not live to the officially announced date of his execution, December 23 of the same year. We can now turn to the linguistic aspect of the Beria case, in particular to Beria’s command of the Russian language.

3 Linguistic analysis of authorship in the Beria case

For many years, Beria’s deputy and main assistant V. Merkulov served as his chief “speechwriter” [Kudriashov, 2017, 211]. There is much other evidence indicating that Beria was not sufficiently educated and did not speak Russian well [Politburo, 2012]. It is now certain that other people wrote a number of the works whose authorship was officially attributed to him. It might seem that the letters from prison confirm that Beria was alive, provided that he was able to write them. But even then they are by no means evidence that Beria was alive at any moment after the last of the letters from prison was written. Some researchers also pay attention to the semantic features of Beria’s letters.
For example, the opening phrase of the first letter, dated June 28, 1953, "I was sure that from that big criticism at the Presidium I would draw all the conclusions I needed and would be useful in the team...", raises many questions, because the meeting of the Presidium at which serious accusations of Beria’s anti-state and anti-party actions were first made is known to have taken place only on June 29, 1953 [Dugin, Shepelev, 2015, 47].

Z. Tskvitaria draws attention to some signs that cast doubt on the authenticity of "Beria’s letters from prison". His opinion deserves the closest attention because of the strange emphasis of the letter’s postscript: "C-des, I apologize that I am not writing very coherently and poorly because of my condition, but also due to weak lighting and the absence of pince-nez (glasses)" [Tskvitariya, 2015, 330-331]. Maybe Beria needed to make excuses for his bad style; or was this postscript added only so that the letter itself would not raise unnecessary questions if it were revealed that the writing style and handwriting differed from those of Beria? Despite this, no one has yet bothered to conduct a handwriting or any other examination ... it is especially interesting that in the documents published by the Democracy Foundation these letters are listed as copies. If this foundation was not given the opportunity to examine the originals, I think the question of examination will remain an unresolved problem.

In attempts to determine the authenticity of handwritten documents attributed to L. Beria, researchers tend not to go beyond a handwriting examination. But even this method sometimes gives positive results. A logical step in the study was an independent handwriting examination conducted by the expert E. Dolzhansky (certificate No. CS7.001.001C).
The expert found differences in particular features of the handwriting and concluded that the differences in the general and particular features of the handwriting are significant and stable and constitute two different individual sets of handwriting attributes, which means that the author of the text of "Letter to the Presidium of the CPSU Central Committee" is one person, and the author of the sample is another person [Dugin, Shepelev, 2015, 47]. Analysis of all the above information leads to the following conclusion: L. Beria could not be the author of "Beria’s letters from prison".

4 Materials and methods

To study the authorship of “Beria’s letters from prison”, we applied quantitative linguistic analysis methods using the Stylo package, written in the programming language R [Eder et al., 2016]. The intertextual measure Delta, proposed by J. Burrows in 2001 [Burrows, 2002], has been tested by many researchers on large volumes of heterogeneous text data:
• English prose of the 20th century [Hoover, 2004],
• modern English poetry [Hoover, 2005], as well as poetic works in Latin [Rybicki et al., 2011],
• long-form prose in English, French, Italian, German, Polish and Hungarian, as well as in Latin and Arabic [Rybicki et al., 2011; Evert et al., 2017; Jannidis et al., 2015],
• political texts in English [Savoy, 2015].

Suppose we have a set of $n$ words of interest over which the Delta measure will be calculated. We denote these words by $w_i$ ($i = 1, \dots, n$) and define $f_i(D)$ as the frequency of the word $w_i$ in the text $D$, $\mu_i$ as the average frequency of that word in the sample, and $\sigma_i$ as the standard deviation of this frequency. Then the standardized estimate, or z-score, of the frequency of use of the word $w_i$ in the text $D$ is calculated by the formula

\[
z(f_i(D)) = \frac{f_i(D) - \mu_i}{\sigma_i}. \tag{1}
\]

The Delta measure is then the average of the absolute values of the differences between the standardized estimates of the frequencies of the words $w_i$ in the texts $D$ and $D'$:

\[
\Delta(D, D') = \frac{1}{n} \sum_{i=1}^{n} \left| z(f_i(D)) - z(f_i(D')) \right|. \tag{2}
\]

This formula can be transformed as follows:

\[
\Delta(D, D') = \frac{1}{n} \sum_{i=1}^{n} \left| z(f_i(D)) - z(f_i(D')) \right|
= \frac{1}{n} \sum_{i=1}^{n} \left| \frac{f_i(D) - \mu_i}{\sigma_i} - \frac{f_i(D') - \mu_i}{\sigma_i} \right|
= \frac{1}{n} \sum_{i=1}^{n} \left| \frac{f_i(D) - f_i(D')}{\sigma_i} \right|. \tag{3}
\]

This transformation shows that Delta is actually independent of the average frequencies $\mu_i$ of the words in the sample, and that it can be regarded as a normalized measure of the difference between the frequencies of each of the words in the texts $D$ and $D'$. Since taking the average only divides the sum by a constant equal to the number $n$ of words under consideration, this step can be neglected when comparing the calculated results. Therefore, the formula can be reduced to the form

\[
\Delta(D, D') = \sum_{i=1}^{n} \frac{1}{\sigma_i} \left| f_i(D) - f_i(D') \right|, \tag{4}
\]

i.e., the Delta measure for the pair of texts $(D, D')$ is equal to the sum, over the set of words $w_i$, of the absolute values of the word frequency differences between the texts $D$ and $D'$, each divided by the standard deviation $\sigma_i$.

Using Delta in the attribution problem, we compare candidate authors for the text $D'$ by evaluating its multidimensional distance to the text $D$, where each dimension (the frequency of use of a word) is scaled by the factor $1/\sigma_i$ (i.e., small differences can affect the result if they belong to a dimension with a small frequency spread) [Argamon, 2008].

For the study of texts in synthetic languages, M. Eder proposed a modification of Delta that increases the weight of frequently used words; it is known as Eder’s Delta [Eder et al., 2016] and is defined as

\[
\Delta_E^{(n)}(D, D') = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{\left| f_i(D) - f_i(D') \right|}{\sigma_i} \cdot \frac{n - i + 1}{n} \right), \tag{5}
\]

where $i$ is the rank of the word $w_i$ in the list of the most frequent words.

Clustering algorithms are often used together with Delta to present the results in the form of a dendrogram. In this study, we used a clustering algorithm that forms clusters by Ward’s method.

5 Results and Discussion

As a result, it was found that
• the intellectual authorship of the so-called “Beria’s letters from prison”, which date back to the period between June 28 and July 2, 1953, most likely belongs not to one person, but to a whole group of persons (speechwriters);
• these persons had previously taken part in the writing of speeches and articles for L. Beria; therefore, they were instructed to falsify these letters.

These speechwriters included V. Merkulov, who, until his arrest in September 1953, held the post of Minister of State Control and had earlier been one of the persons closest to Beria. Merkulov most likely played a leading role in the group of people who wrote these letters on behalf of Lavrenty Beria, because Beria’s letters and his speech dated 03.04.1937, as well as Merkulov’s interrogation protocol and his letters, gradually form one cluster (Fig. 1) [Petrov et al., 2019, 601-602].

On the whole, the results of the authorship research confirm the hypothesis that the so-called “Beria’s letters from prison” and the protocols of his interrogations were written after Beria’s death by V. Merkulov and an unknown third party or third parties.

Beria’s “accomplices” allegedly participated in the so-called “trial” of L. Beria as defendants:
• V. Merkulov, USSR Minister of State Control at that time;
• V. Dekanozov, Minister of the Interior of the Georgian SSR;
• B. Kobulov, Deputy Minister of Internal Affairs of the USSR;
• S. Goglidze, Head of Department 3 of the Ministry of Internal Affairs of the USSR;
• P. Meshik, Minister of Internal Affairs of the Ukrainian SSR;
• L. Vlodzimirsky, Head of the Investigation Unit for Particularly Important Cases of the USSR Ministry of Internal Affairs.

At the end of the judicial investigation, the defendants were given the last word [Sukhomlinov, Murin, 2002]. However, the case does not contain a single protocol of a face-to-face confrontation or cross-examination. Yu. Mukhin summarizes the data that he cited in his book as follows: “All the facts discussed above can be explained only in one way - there were no trials of Beria or of his comrades-in-misfortune. And all the cited “protocols” of this court are fakes concocted under the guidance of Rudenko” [Mukhin, 2002, 301].

Figure 1: Clustering results of “Beria’s letters from prison” and interrogations of Beria and Merkulov

An additional argument confirming that L. Beria died earlier than December 23, 1953 is based on the results of our forensic analysis of the order given to Colonel-General P. Batitsky to execute L. Beria [Sukhomlinov, 2003, 432] and on its comparison with other similar orders of the Military Collegium of the Supreme Court of the USSR. We examined these documents using the recommendations of E. Elagina: “From the point of view of legal status, the following groups of documents can be distinguished: genuine and false. An original document must satisfy the following criteria: there should be no doubt about the source of the document; there should be no doubt about the authenticity of the facts stated in the document; there should be no doubt about the method of manufacturing the document and the absence of unacceptable changes made to the document after completion of its production and execution” [Kriminalistika, 2019, 202]. On all three criteria listed by E. Elagina, serious doubts arose as to the authenticity of the "Order to Batitsky." A comparison of the forensically significant elements of the order addressed to Colonel-General P. Batitsky with the corresponding elements of the other execution orders examined showed significant differences between them.
This circumstance allows us to conclude that the order addressed to Batitsky concerning Beria’s execution is false.

The second stage of the authorship research was therefore to establish the authorship of the texts of the last words of the defendants (Table 1). Since “Beria’s letters from prison” were found to be false at the first stage, the hypothesis was put forward that the whole trial was falsified after the death of the main defendant, and that the last words of the defendants were also falsified.

The results of text processing using the Stylo package are shown in Fig. 2. The figure shows that the seven texts were divided into two clusters: one includes the last words of Merkulov, Dekanozov and Kobulov; the other contains the last words of Meshik, Goglidze, Beria and Vlodzimirsky. It can be assumed that the authors of all the texts were two unknown persons who were entrusted with creating the appearance of a real trial and falsifying the necessary documents.

Figure 2: Clustering of the “Last words of the defendants”

References

[Argamon, 2008] Argamon S. (2008) Interpreting Burrows’s Delta: Geometric and Probabilistic Foundations // Literary and Linguistic Computing, Vol. 23, No. 2. Pp. 131–147.

[Burrows, 2002] Burrows J. (2002) ‘Delta’: a Measure of Stylistic Difference and a Guide to Likely Authorship // Literary and Linguistic Computing, Vol. 17, No. 3. Pp. 267–287.

[Beria, 1999] Lavrenty Beria. 1953. Stenogramma iyul’skogo plenuma TSK KPSS i drugiye dokumenty. Pod red. akad. A. N. Yakovleva; sost. V. Naumov, YU. Sigachev. M.: MFD, 1999. [Lavrenty Beria. 1953. Transcript of the July plenum of the CPSU Central Committee and other documents. Ed. Acad. A. N. Yakovlev; comp. V. Naumov, Yu. Sigachev. M.: MFD, 1999].

[Chertinov, 2018a] Chertinov V. 112 dney "ottepeli". Za chto ubili Beriyu? // Vash taynyy sovetnik. 2018. Aprel’. № 4 (46). S. 20–21. [Chertinov V. 112 days of the thaw. Why did they kill Beria? // Your Privy Advisor. 2018. April.
No. 4 (46). P. 20–21].

[Chertinov, 2018b] Chertinov V. Shekspirovskiy final. Umirayushchiy Stalin komu-to grozil // Vash taynyy sovetnik. 2018. Aprel’. № 4 (46). S. 12–14. [Chertinov V. A Shakespearean finale. A dying Stalin threatened someone // Your Privy Advisor. 2018. April. No. 4 (46). P. 12–14].

[Dugin, Shepelev, 2015] Dugin, A. N., Shepelev, V. N. Dokumenty RGASPI ob ustranenii L. P. Berii // Otechestvennyye arkhivy. 2015. № 3. S. 44–48. [Dugin, A. N., Shepelev, V. N. Documents of the Russian State Archive of Socio-Political History (RGASPI) on the Elimination of L. P. Beria // Domestic Archives. 2015. No. 3. P. 44–48].

[Eder et al., 2016] Eder M., Rybicki J., Kestemont M. (2016) Stylometry with R: A Package for Computational Text Analysis // The R Journal, Vol. 8 (1). Pp. 107–121.

[Evert et al., 2017] Evert St., Proisl Th., Jannidis F., Reger Is., Pielström St., Schöch Chr., Vitt Th. (2017) Understanding and explaining Delta measures for authorship attribution // Digital Scholarship in the Humanities, Vol. 32, Suppl. 2. Pp. ii4–ii16.

[Grieve, 2007] Grieve J. (2007) Quantitative Authorship Attribution: An Evaluation of Techniques // Literary and Linguistic Computing, Vol. 22, No. 3. Pp. 251–270.

[Jannidis et al., 2015] Jannidis F., Pielström St., Schöch Chr., Vitt Th. (2015) Improving Burrows’ Delta – An empirical evaluation of text distance measures // Digital Humanities Conference, 2015, Sydney.

[Juola, 2009] Juola P. (2009) JGAAP: A System for Comparative Evaluation of Authorship Attribution // JDHCS 2009, Vol. 1, No. 1.

[Hoover, 2005] Hoover D. (2005) Delta, Delta Prime, and Modern American Poetry: Authorship Attribution Theory and Method // Proceedings of the 2005 ALLC/ACH Conference.

[Hoover, 2004] Hoover D. L. (2004) Testing Burrows’s Delta // Literary and Linguistic Computing, Vol. 19, No. 4. Pp. 453–475.

[Kudriashov, 2017] Kudryashov N. A. Beria i sovetskiye uchenyye v Atomnom proyekte. Kn. 2: Sud’ba Lavrentya Berii. M.: LENAND, 2017. [Kudryashov N. A.
Beria and Soviet scientists in the Atomic Project. Vol. 2: The fate of Lavrenty Beria. M.: LENAND, 2017].

[Petrov et al., 2019] Petrov V. V., Marusenko M. A., Piotrovskaya K. R., Mañas I. N., Mamayev N. K. Ob avtorstve «pisem Berii iz zatocheniya» // Vestnik Sankt-Peterburgskogo universiteta. Pravo. 2019. T. 10. Vyp. 3. S. 586–605. [Petrov V. V., Marusenko M. A., Piotrovskaya K. R., Mañas I. N., Mamaev N. K. On the Authorship of “Beria’s Letters from Prison” // Bulletin of St. Petersburg University. Law. 2019. Vol. 10. Issue 3. Pp. 586–605].

[Pereyaslavov, 2002] Pereyaslov N. V. Tayna dvukh vystrelov. M.: GALA PRESS, 2002. [Pereyaslov N. V. The Mystery of Two Shots. M.: GALA PRESS, 2002].

[Politburo, 2012] Politbyuro i delo Beria. Sbornik dokumentov / Pod obshch. red. O. B. Mozokhina. M.: Kuchkovo pole, 2012. [Politburo and the Beria case. Collection of documents / Gen. ed. O. B. Mozokhin. M.: Kuchkovo Pole, 2012].

[Prudnikova, 2014] Prudnikova Ye. A. Strategiya Pobedy. M.: OLMA Media Grupp, 2014. [Prudnikova E. A. Victory Strategy. M.: OLMA Media Group, 2014].

[Mukhin, 2002] Mukhin YU. I. Ubiystvo Stalina i Beria. Nauchno-istoricheskoye rassledovaniye. M.: KRYMSKIY MOST-9D, FORUM, 2002. [Mukhin Yu. I. The assassination of Stalin and Beria. Scientific and historical investigation. M.: Krymskiy Most-9D, FORUM, 2002].

[Prudnikova, 2017] Prudnikova Ye. A. Vtoroye ubiystvo Stalina. M.: Veche, 2017. [Prudnikova E. A. The second assassination of Stalin. M.: Veche, 2017].

[Sever, 2018] Sever A. Beria i NKVD nakanune i v gody Velikoy Otechestvennoy voyny. Moskva: Rodina, 2018. [Sever A. Beria and the NKVD on the eve of and during the Great Patriotic War. Moscow: Rodina, 2018].

[Tskvitariya, 2015] Tskvitariya Z. CH. Beria bez lzhi. Kto dolzhen kayat’sya? M.: Yauza-press, 2015. [Tskvitaria Z. Ch. Beria without lies. Who should repent? M.: Yauza-press, 2015].

[Yerashov, 2010] Yerashov V. P. Ubiytsy v belykh khalatakh, ili Kak Stalin gotovil yevreyskiy pogrom.
M.: Eksmo: Algoritm, 2010. [Erashov V. P. Assassins in white coats, or How Stalin prepared the Jewish pogrom. M.: Eksmo: Algoritm, 2010].

[Sukhomlinov, Murin, 2002] Sukhomlinov A. V., Murin YU. G. Prigovor okonchatel’nyy i obzhalovaniyu ne podlezhit. Posledniye slova podsudimykh i prigovor po delu Berii i yego soobshchnikov. № 6 (60) / 2002. M.: 2002. S. 74–89. [Sukhomlinov A. V., Murin Yu. G. The verdict is final and is not subject to appeal. The last words of the defendants and the verdict in the case of Beria and his associates. No. 6 (60) / 2002. M.: 2002. P. 74–89].

[Sukhomlinov, 2003] Sukhomlinov A. V. Kto vy, Lavrenty Beria? M.: Detektiv-Press, 2003. [Sukhomlinov A. V. Who are you, Lavrenty Beria? M.: Detektiv-Press, 2003].

[Kriminalistika, 2019] Kriminalistika: uchebnik / kollektiv avtorov; pod red. T. A. Sedovoy, S. P. Kushnirenko, V. D. Pristanskova. Moskva: YUSTITSIYA, 2019. [Forensics: a textbook / team of authors; ed. by T. A. Sedova, S. P. Kushnirenko, V. D. Pristanskov. Moscow: YUSTITSIYA, 2019].

[Rybicki et al., 2011] Rybicki J., Eder M. (2011) Deeper Delta across genres and languages: do we really need the most frequent words? // Literary and Linguistic Computing, Vol. 26, No. 3. Pp. 315–321.

[Savoy, 2010] Savoy J. (2010) Lexical analysis of US political speeches // Journal of Quantitative Linguistics, Vol. 17, No. 2. Pp. 123–141.

[Savoy, 2015] Savoy J. Authorship Attribution: A Comparative Study of Three Text Corpora and Three Languages // Journal of Quantitative Linguistics, Vol. 19, Issue 2. Pp. 132–161.
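
As a supplementary illustration of the Delta measures defined by formulas (1)–(5) in Section 4, the following is a minimal Python sketch. It is not the Stylo implementation used in this study: the function names, the three toy "texts" and their word frequencies are all hypothetical, the words are assumed to be listed in order of decreasing frequency (so that the list position serves as the rank i in formula (5)), and the sample standard deviation is assumed for σi.

```python
from statistics import mean, stdev


def z_scores(freqs, mu, sigma):
    """Formula (1): standardize each word frequency of one text."""
    return [(f - m) / s for f, m, s in zip(freqs, mu, sigma)]


def burrows_delta(f_a, f_b, mu, sigma):
    """Formula (2): mean absolute difference between the z-scores of two texts."""
    z_a = z_scores(f_a, mu, sigma)
    z_b = z_scores(f_b, mu, sigma)
    return sum(abs(x - y) for x, y in zip(z_a, z_b)) / len(sigma)


def burrows_delta_simplified(f_a, f_b, sigma):
    """Formula (3): the means cancel, leaving |f_i(D) - f_i(D')| / sigma_i."""
    return sum(abs(a - b) / s for a, b, s in zip(f_a, f_b, sigma)) / len(sigma)


def eder_delta(f_a, f_b, sigma):
    """Formula (5): the word of rank i (1-based) is weighted by (n - i + 1) / n."""
    n = len(sigma)
    total = 0.0
    for rank, (a, b, s) in enumerate(zip(f_a, f_b, sigma), start=1):
        total += abs(a - b) / s * (n - rank + 1) / n
    return total / n


# Hypothetical relative frequencies of three words in three texts.
# Columns are words, assumed to be sorted by decreasing average frequency.
corpus = {
    "text_A": [0.050, 0.030, 0.020],
    "text_B": [0.048, 0.031, 0.022],
    "text_C": [0.030, 0.045, 0.010],
}

# Per-word mean and standard deviation across the sample of texts.
columns = list(zip(*corpus.values()))
mu = [mean(col) for col in columns]
sigma = [stdev(col) for col in columns]  # sample standard deviation (an assumption)

for other in ("text_B", "text_C"):
    d = burrows_delta(corpus["text_A"], corpus[other], mu, sigma)
    e = eder_delta(corpus["text_A"], corpus[other], sigma)
    print(f"text_A vs {other}: Delta = {d:.3f}, Eder's Delta = {e:.3f}")
```

In this toy sample, text_A should come out closer to text_B than to text_C under both measures, and `burrows_delta` and `burrows_delta_simplified` should agree up to rounding, which is the point of the transformation in formula (3). In an actual study, the pairwise distances would be collected into a matrix and passed to a hierarchical clustering routine (Ward's method in this paper) to obtain a dendrogram.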