Proceedings of the Big data analysis tasks on the supercomputer GOVORUN Workshop (SCG2020)
Dubna, Russia, September 16, 2020

ANXIETY LEVEL RESEARCH IN ELDERLY PEOPLE DURING THE COVID-19 PANDEMIC

N. M. Zalutskaya 1,2, A. I. Smurova 3, N. A. Gomzyakova 1, Ya. I. Krasnova 1, N. L. Shchegoleva 3,4, N. G. Neznanov 1,2

1 V.M. Bekhterev National Medical Research Center for Psychiatry and Neurology, Russian Federation, 192019, St. Petersburg, 3 Bekhtereva str.
2 Pavlov First Saint Petersburg State Medical University, Russia, 197022, St. Petersburg, 6-8 Lev Tolstoy st.
3 Saint Petersburg Electrotechnical University "LETI", 197376, Russia, St. Petersburg, 5 Professor Popov st.
4 Saint Petersburg State University, 199034, Russia, St. Petersburg, 7/9 Universitetskaya Emb.

E-mail: nzalutskaya@yandex.ru

The paper presents data from a study of the psychological state and level of anxiety of 152 elderly people in several megacities of Russia during the COVID-19 pandemic and the related restrictions. The data were obtained through a telephone or online survey. During the period of restrictive measures associated with coronavirus infection, elderly people showed a wide range of anxiety indicators and of ways of assessing the situation and behaving in it. To assess whether groups of people with similar characteristics could be identified, we constructed a dendrogram and a scree plot; their analysis showed that only the features directly related to the objectives of the study should be used, rather than all the data. Principal component analysis made it possible to study the data in detail and to highlight the most important characteristics for assessing the level of anxiety in the elderly during the COVID-19 pandemic, namely: self-assessment of mood, the general situational anxiety indicator of the IAT, danger of communication, and social frustration. These features were used in the experiments performed.
Clusters were formed using the k-means and ISODATA methods. Clustering grouped the data of the Integrative Anxiety Test and of the visual analogue scale of anxiety and well-being into different numbers of clusters according to the severity of situational and personal anxiety and the subjective assessment of one's own state among elderly people facing an infectious threat. The data obtained allow us to conclude that, in the event of an infectious threat, a multivariate approach is required not only to providing information about the disease and its prognosis, but also to organizing psychological assistance for the elderly.

Keywords: COVID-19, old age, anxiety, well-being, clustering, dendrogram, k-means, ISODATA

Natalia Zalutskaya, Anastasia Smurova, Natalia Gomzyakova, Yana Krasnova, Nadezhda Shchegoleva, Nikolay Neznanov

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

1. Introduction

On March 18, 2020, the "Decree on ensuring the regime of isolation in order to prevent the spread of COVID-19" came into force in the Russian Federation [1]. Because there were no evidence-based treatment protocols, no vaccine to make prevention possible, and only contradictory information about the course and prognosis of the disease, self-isolation of older people was considered the only possible way to avoid a potential threat to life. The task was to study the emotional state of elderly residents of Russian megacities in a situation of infectious threat and the related social isolation, as well as their attitude to quarantine measures. The data were obtained from a structured interview and survey conducted between April 20 and May 8, 2020, and from the Integrative Anxiety Test (IAT) [2].
As of May 8, the date for lifting the self-isolation regime had not been determined. The survey was conducted predominantly by telephone, since the oldest respondents lacked Internet skills and face-to-face appointments for older adults were limited and not recommended during the self-isolation period. The sample for analysis comprised 152 completed questionnaires of 67 questions each. The main difficulty in processing the sample is that it contains data in different formats: numerical values, single-choice answers, and multiple-choice answers. Before the clustering methods could be applied, all data had to be brought to a single format without losing the meaning of the answers.

2. Clustering of the surveyed sample and further interpretation of data

The aim of the study was to identify patterns in the behavior and condition of older people in a situation of infectious threat. The first step is to establish whether the data can be divided into clusters and to determine their possible number. The number of clusters was assessed in two ways: by constructing a dendrogram and by constructing a scree plot. The dendrogram built on all 67 features showed that the respondents differ significantly from each other, since the merge points are high relative to the preceding ones. The scree plot built on all survey data covered numbers of clusters from 2 to 20; the distance between clusters changes smoothly as the number of clusters increases, so the plot does not allow the number of clusters to be determined even approximately.

Fig. 1. Dendrogram, built on 7 features

Fig. 2. Scree plot built on 7 features

The main purpose of the interview was to identify the level of anxiety and fear among older people, so further analysis was carried out on features such as self-assessment of mood, general results of the Integrative Anxiety Test (IAT), assessment of emotional discomfort, danger of communication, and social frustration. Figures 1 and 2 show the dendrogram and the scree plot for the selected features.

On the dendrogram for the part of the data selected for the study, the fusion height of the two branches decreased; the feature vectors describing the survey participants therefore became closer already at the first stage of construction. The scree plot shows a sharp decrease in the distance between clusters for K from 2 to 5. For the entire data volume (67 features), no assumption about the number of clusters can be made, since the results contradict each other (dendrogram: 6 or more clusters; scree plot: 2 clusters), whereas the diagrams for the data selected for the study (7 features) make it possible to explicitly identify 3-4 clusters. For further research, it is therefore advisable to use only part of the data.

3. Experiments on respondent groups selection

3.1. Experiment 1

Analysis of the plots showed that respondents are divided into 3-4 clusters, but a semantic interpretation of the results is more valid for 4 clusters. The resulting clusters are described below. In the first experiment, the k-means method [3] was used for clustering.

Considering the significant indicators (p ≤ 0.001), cluster 1 respondents were distinguished from the others by IAT medians above the normative range (from 7 stanines) and showed high values of the total indicators of situational anxiety and personal anxiety (Me = 7).
Regarding concerns about contacts due to coronavirus infection, respondents in this cluster more often chose the answer "I fear, try not to contact people" (57% of the cluster) and, accordingly, regarding social frustration, more often answered that they experience a "significant lack of communication" (54% of the cluster).

Cluster 2 median values according to the IAT were consistent with the norm and corresponded to the "optimal" level of anxiety (Me = 4) and personal anxiety (Me = 6); the average self-reported mood was within the norm (63.3 ± 17.1). Regarding concerns about contact, 51% of respondents chose the answer "moderately fear, keep my distance and use PPE" and 49% chose "fear, try not to contact people." 51% of cluster 2 respondents did not experience a lack of communication, and 35% indicated that it was "insignificant."

Cluster 3 medians by IAT indicated low levels of situational anxiety (Me = 2) and of its emotional discomfort component (Me = 2), while personal anxiety (Me = 5) and its emotional discomfort component (Me = 5) were consistent with normative values. The prevailing answer among respondents in this cluster was that they had no concerns about contacts with people due to coronavirus infection (90% of the cluster).

Cluster 4 IAT indicators show a low level of situational anxiety (Me = 1) and personal anxiety (Me = 3), as well as of their components. Regarding contacts, a position of moderate alertness prevailed: 67% of cluster 4 respondents chose the answer "moderately afraid, keeping my distance and using PPE", and 51% denied any lack of communication.

3.2. Experiment 2

A disadvantage of the k-means algorithm is that the expected number of clusters must be specified in advance, which is often difficult or even impossible. As a result, the algorithm has to be applied several times with different values of k, and the resulting clusters then have to be analyzed to determine the most informative partition.
Therefore, in the following experiments, the ISODATA algorithm [4] was used with the number of clusters set to 5 and the maximum number of merges set to 3, so that between 2 and 5 clusters could be obtained. In this experiment, the data were divided into 3 clusters.

In cluster 1, the IAT medians of situational anxiety (Me = 7) and personal anxiety (Me = 7) are higher than the normative ones, while emotional discomfort, as a component of both situational anxiety (Me = 5) and personal anxiety (Me = 6), is within the norm. Among cluster 1 respondents, the prevailing answer regarding contacts due to coronavirus infection was "I fear, try not to contact people" (59.6% of the cluster); regarding the lack of communication, respondents in this cluster chose the answer that their social interaction had not changed (1.9% of the cluster).

Cluster 2 test indicators show a low level of situational anxiety (Me = 2) and a normal level of personal anxiety (Me = 5); emotional discomfort, as a component of both situational anxiety (Me = 4) and personal anxiety (Me = 4.5), is within the norm. Cluster 2 respondents predominantly denied having fears of contact with people due to coronavirus infection (93.7% of the cluster).

According to the tests, cluster 3 respondents show a low level of situational anxiety (Me = 2) and of its emotional discomfort component (Me = 1), while personal anxiety (Me = 4) and its emotional discomfort component (Me = 4) correspond to the normative IAT values. Among cluster 3 respondents, the more common answer regarding contacts with people was that they have a moderate fear, keep their distance, and use PPE (61.7%); regarding social frustration, they more often denied a lack of communication (57.3%).
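The constraint stated at the start of this experiment (5 clusters, at most 3 merges, hence 2 to 5 resulting clusters) can be illustrated with a minimal sketch. It is not the authors' implementation: full ISODATA [4] also splits clusters and iterates, whereas this sketch only runs k-means once and then merges the closest clusters. The function names, the `min_dist` threshold, the deterministic initialisation, and the synthetic two-dimensional points are all assumptions made for illustration; no survey data are used.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means; returns the non-empty clusters as lists of points."""
    # Assumption: evenly spaced samples serve as initial centroids
    # (the paper does not state how its runs were initialised).
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return [c for c in clusters if c]


def merge_close(clusters, min_dist, max_merges):
    """Merge-only step: join the closest pair while it is nearer than min_dist."""
    clusters = [list(c) for c in clusters]
    for _ in range(max_merges):
        if len(clusters) < 2:
            break
        pairs = [
            (math.dist(mean(clusters[i]), mean(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        dist, i, j = min(pairs)
        if dist >= min_dist:
            break  # remaining clusters are far enough apart
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Synthetic data (hypothetical): three tight groups of five points each.
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

initial = kmeans(points, k=5)                             # start from 5 clusters
final = merge_close(initial, min_dist=3.0, max_merges=3)  # at most 3 merges
print(len(initial), "->", len(final))
```

Because each merge removes exactly one cluster, starting from 5 clusters with at most 3 merges can never leave fewer than 2, which mirrors the range given in the text.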
3.3. Experiment 3

For a more detailed study of the data, principal component analysis [5] was used in the subsequent experiments. This method is widely used in bioinformatics and psychodiagnostics to reduce the dimension of object descriptions, to extract significant information, and to visualize data. The principal component analysis showed that the most important characteristics are the following: self-assessment of mood, the general situational anxiety indicator of the IAT, danger of communication, and social frustration. These features were used to form the clusters. Cluster formation was performed using the k-means and ISODATA methods.

The third experiment is based on clustering using the k-means method. Figure 5 shows the clustering results (X axis: component 1, Y axis: component 2, Z axis: component 3, point size: component 4, color: cluster number); the respondents in the resulting clusters are described below. The data were divided into 5 clusters.

The IAT median of cluster 1 corresponds to a low level of situational anxiety (Me = 2); emotional discomfort, as its component, is also at a low level (Me = 3). Regarding fears of contact with people due to coronavirus infection, the position of denial prevails (81.8% of the cluster), but at the same time more than half of the respondents indicated a significant lack of communication (53.8%), while 41% reported no lack of communication at all. No respondents answered that they had stopped contacts with people for fear of coronavirus infection or that their communication had not changed.

For cluster 2, the IAT median values indicate a normal level of situational anxiety (Me = 6) and of emotional discomfort (Me = 6) as its component.
The level of personal anxiety (Me = 6) and its component, emotional discomfort (Me = 5), correspond to the normal level according to the IAT. The majority of respondents (61.5%) denied having fears of contact with people due to coronavirus infection, but the majority (69.2%) complained of a significant lack of communication.

Fig. 5. Clustering results obtained using the k-means method

In cluster 3, the IAT medians of situational anxiety (Me = 1) and its emotional discomfort component (Me = 1), and of personal anxiety (Me = 3) and its emotional discomfort component (Me = 4), correspond to low levels of anxiety. The majority of respondents (56% of the cluster) indicated that they have moderate concerns about contacts with people, keep their distance, and use PPE; 51.2% of respondents reported no feeling of a lack of communication.

For cluster 4, the IAT medians show a high level of situational anxiety (Me = 8) and emotional discomfort (Me = 7). High levels of personal anxiety (Me = 8) and personal emotional discomfort (Me = 8) were also observed. No respondents in this cluster denied having fears of contact with people or reported being selective about contacts; 59.2% reported that they tried to avoid contacts, and 40.7% that they were moderately afraid and took precautions (keeping their distance, using PPE). Regarding social frustration due to lack of communication, 48.1% reported it as significant and 29.6% as insignificant.

In cluster 5, the IAT median values show a normal level of situational anxiety (Me = 5) and of emotional discomfort (Me = 4) as its component. The level of personal anxiety (Me = 5) and its component, emotional discomfort (Me = 5), correspond to the normal level according to the IAT.
There is no clear position regarding fears of contact related to COVID-19. About half of the respondents (49%) reported no lack of communication.

3.4. Experiment 4

In this experiment, the ISODATA method was used for clustering. As a result, 3 clusters were formed. Figure 6 shows the results (X axis: component 1, Y axis: component 2, Z axis: component 3, point size: component 4, color: cluster number); the response characteristics of the resulting clusters are given below.

In cluster 1, the medians of situational anxiety (Me = 2) and emotional discomfort (Me = 3) correspond to low IAT anxiety values. Personal anxiety (Me = 4) and its component, emotional discomfort (Me = 4), correspond to the normative IAT values. 56% of respondents reported that they were not afraid of contact with people due to coronavirus infection, and 29.3% were moderately afraid. The frequencies of responses about lack of communication reveal no definite position among cluster 1 respondents.

Cluster 2 respondents show a high level of situational anxiety (Me = 7) according to the IAT; however, the indicator of situational emotional discomfort (Me = 5) corresponds to normative values. Personal anxiety (Me = 6) and emotional discomfort (Me = 6) correspond to a normal level. Regarding the danger of contacts with people, the answers "I fear, I try not to contact people" (48.3%) and "moderately fear, I keep the distance using PPE" (36.6%) prevail among respondents in this cluster. A slight lack of communication in connection with the self-isolation regime was reported by 36.6% of cluster 2 respondents.

In cluster 3, the medians of situational anxiety (Me = 2) and emotional discomfort (Me = 1) correspond to a low level of IAT anxiety. Personal anxiety (Me = 4) and emotional discomfort (Me = 4), as its component, correspond to a normal level of anxiety.
More than half (55%) of cluster 3 respondents are moderately afraid of contact with people and take precautions; 70.5% do not experience a lack of communication during self-isolation.

Fig. 6. Clustering result obtained using the ISODATA method

4. Conclusion

Based on the results of the experiments, it can be concluded that the clustering methods applied to the survey data of elderly people during the COVID-19 pandemic make it possible to distinguish at least three different groups. The groups differ in the level of situational anxiety and in the degree of emotional anxiety, which in turn determines the self-assessment of mood and the attitude to contacts between people during the pandemic. Despite these differences, almost all survey participants noted a change in the way they communicate, as well as a lack of communication. Since this study is one of the first in this direction, additional experiments are required to clarify the results; these will make it possible to build a model of a representative of each group and to determine the best way to interact with them.

References

[1] Chief State Sanitary Doctor of the Russian Federation. «Resolution on ensuring the isolation regime in order to prevent the spread of COVID-19», 2020. https://base.garant.ru/73764449/ (in Russian)
[2] Bizyuk A.P. The use of an integrative anxiety test (IAT): guidelines / N.N. V. M. Bekhterev; authors compilers: A. P. Bizyuk, L. I. Wasserman, B. V. Iovlev. SPb., 2001. 16 p. (in Russian)
[3] MacQueen J. (1967). Some methods for classification and analysis of multivariate observations. In Proc. 5th Berkeley Symp. on Math. Statistics and Probability, pages 281-297.
[4] Tou J.T., Gonzalez R.C. (1974). Pattern Recognition Principles. Addison-Wesley Publishing Company, 377 p.
[5] Kukharev G.A., Shchegoleva N.L. Face Recognition Systems. SPb, Publishing SPbSEU “LETI”, 2006, 176 p. (in Russian)