On the Relationship Between the Five-Factor
Personality Model and the Color-Brightness and
Statistical Characteristics of Images Published in
Social Networks
Evgenia Kuminskaya (a), Alexander Talalaev (b), Vitaly Fralenko (b) and
Vyacheslav Khachumov (b)

(a) Psychological Institute of RAE, Moscow, Russia
(b) Ailamazyan Program Systems Institute of RAS, Veskovo, Russia


Abstract
The purpose of this article is to study the relationship between personality traits and the content of images posted in social networks. The paper attempts to identify informative features and appropriate ways to configure artificial neural networks. The developed technique includes obtaining several color-brightness and statistical characteristics of image collections in the form of histograms and BoW dictionaries, with further construction of classifiers based on artificial neural networks to test the hypothesis of an interrelation between the available graphic data of social network users and the five-factor personality model. The survey that provided the training and test samples was administered by employees of the Psychological Institute of RAE using the "NEO-FFI" test, which includes 60 questions. The image collections used are datasets published by users of the "VKontakte" social network. The problems of determining personality factors were solved experimentally using classifying and predictive artificial neural networks. The work confirmed the prevailing opinion that there is no significant interrelation (correlation) between posted images and "Big Five" personality factors. From published images, the factors "Openness" and "Agreeableness" are predicted best and "Neuroticism" worst. The results of personality trait prediction improve as the number of neural network layers grows, up to the point of overfitting.




1. Introduction
One topical line of research conducted by psychologists together with engineers and mathe-
maticians concerns establishing the connection between personality factors (traits) and graphic
content published in social networks. It is believed that a person can be described by five
traits, the "Big Five" model: openness to experience, intellect ("Openness" or "O"-factor);
conscientiousness, self-awareness, integrity ("Conscientiousness" or "C"-factor); extraversion,
vigor, proneness to contact ("Extraversion" or "E"-factor); goodwill, friendliness, the ability to
get along with others ("Agreeableness" or "A"-factor); neuroticism, emotional instability,
anxiety, low self-esteem ("Neuroticism" or "N"-factor).

Russian Advances in Artificial Intelligence: selected contributions to the Russian Conference on Artificial Intelligence
(RCAI 2020), October 10-16, 2020, Moscow, Russia
j-aquarius@bk.ru (E. Kuminskaya); arts@arts.botik.ru (A. Talalaev); alarmod85@hotmail.com (V. Fralenko);
vmh48@mail.ru (V. Khachumov)
ORCID: 0000-0002-9747-3837 (A. Talalaev); 0000-0002-9747-3837 (V. Fralenko)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
   Foreign sources were the first to mention the existence of such a connection. For example,
using a convolutional artificial neural network (ANN) of the "VGG-Net" model [Sim15], a
general evaluation of the connection between personality factors and various features was
obtained using Pearson's correlation. Table 1 presents the results.

Table 1
Correlation between the “Big Five” factors and attributes

                         Features        O       C       E       A       N
                          Colors       0.284   0.352   0.293   0.317   0.398
                        All Images     0.448   0.479   0.369   0.336   0.593
                           Text        0.168   0.059   0.223   0.111   0.261


   From this, it becomes clear that taking images and their characteristics into account can
make quite a definite contribution to the prediction of personality traits. In combination, the
two modalities (text, image) provide a more accurate assessment of personality, revealing what
may be missed by either modality alone.
   We may refer to studies that have attempted to examine the relationship between a person
and the content of published images. For example, [Cha17] studied personality factors in
the context of corpora of "Twitter" images. It is shown, for example, that users with a high
degree of openness to experience value art, which manifests itself in the publication
and approval of sketches or images containing musical instruments. The authors of [Cha17]
analyzed 34,875 pictures of 232 "Twitter" users. To assign points to each user in the
evaluation of the "Big Five" personality traits, an automatic text regression method was
used [Par15]. In [Par15], the authors also took into account such features as the color and
content of photos. Research has shown that colors can evoke emotions and influence psy-
chological states. The HSV color model (Hue, Saturation, Value) was used to analyze the color
components of images. The authors used various histograms and the standard deviation of HSV
values to predict personality factors. The results of the Pearson correlation analysis
of the interrelation of features and factors are presented in Table 2.
   Single-factor correlation tests were used to identify the relationship between image charac-
teristics and personality. The analysis of the relationship between the color components
of images and personality factors made it possible to establish, in the most general way, the
relationship between the model factors.
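   For illustration, a single-factor test of this kind reduces to computing Pearson's r between one averaged image feature and one factor score per user. A minimal sketch in Python follows; the arrays are hypothetical toy data, with one entry per surveyed user.

```python
# Minimal sketch of a single-factor Pearson correlation test between an
# averaged image feature and a "Big Five" factor score (hypothetical toy
# arrays; real values would be one entry per surveyed user).
import numpy as np
from scipy.stats import pearsonr

mean_brightness = np.array([0.41, 0.58, 0.33, 0.72, 0.49])  # feature per user
openness_scores = np.array([30, 41, 22, 45, 35])            # NEO-FFI "O", 0-48

r, p_value = pearsonr(mean_brightness, openness_scores)
print(f"Pearson r = {r:.3f}, p-value = {p_value:.3f}")
```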
   The study [Fer15] attempts to determine personality traits based on how users take pho-
tos, share photos, and apply different photo filters to adjust the appearance of images in the
"Instagram" network. From 113 participants and 2,298 extracted photographs, distinc-
tive features (e.g., hue, brightness, saturation) associated with personality traits were
found. In an online survey, participants completed the widely used "Big Five Inventory" (BFI)
personality questionnaire and provided access to the content of their "Instagram" accounts. The
results show the relationship between personality traits and how users want their photos to
Table 2
Correlation between factors, colors and image content

                       Feature           O        C         E        A        N
                      Grayscale       0.039   -0.139    -0.128   -0.152    0.262
                     Brightness      -0.108    0.040     0.124    0.027   -0.020
                     Saturation      -0.017    0.023     0.102    0.076   -0.077
                      Pleasure      -0.0017    0.932    -0.079    0.037   -0.024
                       Arousal       -0.007    0.005     0.119    0.048   -0.054
                     Dominance        0.005   -0.013     0.113    0.010   -0.021
                     Hue Count       -0.094    0.040     0.118    0.085   -0.103



look. Descriptors based on color in the HSV color space were extracted for each image in the
collection. The article obtained descriptive results linking each personality trait with
corresponding color and brightness characteristics. Average attribute values were used to
calculate the correlation matrix (see Table 3). Cases are known where face images are used
to predict personality factors; for example, in [Kac20], an analysis of the relevant studies
was made and experiments were conducted.
   There is no doubt that the results obtained are preliminary and require further research on
other image corpora, involving proven information technologies based on surveys and
automated image analysis in personality assessment.


2. Research Objectives
The present work tests the hypothesis that a significant connection exists between the fea-
tures of the five-factor personality model and the color-brightness and statistical char-
acteristics of image collections. The test is carried out on data from the social
network "VKontakte" and the questionnaire results obtained by psychologists from the
Psychological Institute of RAE (the "NEO-FFI" test, 60 questions [Cos92]). The expert data on
the personality model comprise five values in the range 0-48. Image collections are sets of
graphic files containing thematic collections of images published by users in social net-
works. The entire data set available for analysis, containing expert information and the
graphic files of the test subjects, covers 1,346 people. For the experiment, 859 subjects
whose image collections contained three or more files were selected from this set.
   We wish to solve the following problems.

   1. The first task is to determine personality factors using a classifying feedforward ANN.
      The solution involves constructing a feature space based on histograms of the distribution
      of color and brightness characteristics for each class of images separately; ANN training
      is carried out on the obtained histograms. An alternative is probability distributions:
      the mathematical expectation, the variance, and the type of distribution function are
      determined.
Table 3
Correlation matrix of personal factors and image characteristics

                          Feature             O       C        E      A       N
                            Red             -0.06    0.02    0.17   -0.05    0.03
                           Green             0.17    0.14    0.23   0.03    -0.12
                           Blue             -0.01      0     0.17   0.02    -0.01
                          Yellow             0.01    0.04    0.01   0.14    -0.07
                          Orange            -0.03   -0.07   -0.16   0.02     0.06
                           Violet              0    -0.06   -0.09   -0.07    0.06
                       Bright.mean          -0.25    -0.1   -0.19   -0.07    0.22
                        Bright.var           0.06      0       0    -0.07    0.05
                        Bright.low           0.28    0.09    0.16   -0.05   -0.16
                        Bright.mid          -0.09    0.06    0.04   0.15    -0.06
                        Bright.high          -0.2   -0.12   -0.18   -0.08    0.21
                         Sat.mean            0.16    0.06    0.03   -0.04      0
                          Sat.var.            0.2    0.16    0.19    0.1    -0.05
                          Sat.low           -0.08    0.02    0.02   0.07     0.01
                          Sat.mid            0.08    0.09    0.02   0.07     0.01
                         Sat.high            0.13     0.1    0.04   -0.01    0.01
                           Warm             -0.05   -0.04    -0.2     0      0.03
                           Cold              0.05    0.04     0.2     0      0.03
                         Pleasure           -0.19   -0.08   -0.18   -0.09    0.22
                          Arousal            0.23    0.09     0.1     0     -0.08
                        Dominance            0.28    0.11    0.17   0.05    -0.18
                      Number of faces       -0.16    0.03    0.11   -0.11   -0.03
                     Number of people        0.22   -0.05   -0.07   -0.01    0.07



   2. The second task is to determine personality factors with the help of a predictive ANN.
      The solution involves building BoW (Bag of Words) dictionaries based on KAZE descrip-
      tors extracted from images in user profiles [Alc12]. Training and test vectors formed
      with the help of such dictionaries, or BoW descriptors, are information records
      indicating the presence or absence of certain visual words in a user's images. The
      approach is as follows: a training sample of KAZE descriptors is extracted from the
      images, clustered, and formed into bags of visual words; a multilayer feedforward
      neural network for trait prediction is then trained on the extracted BoW descriptors.
      Training is carried out using the user profiles of the selected social network.
  The work undoubtedly belongs to the field of artificial intelligence, since it explores the
possibility of creating a technology that allows the psychological characteristics of a person
to be judged from the images they post. The following factors determine the complexity of both
approaches: 1) non-numerical data must be converted into numerical data; 2) the values of
individual attributes are insufficiently represented. This study supplements the cycle of
works [Kis19, Fra19], in which convolutional neural networks, including "ResNet",
"InceptionV3", "AlexNet", and "VGG", were investigated for solving the task directly.
3. The Method of the Classifying ANN
For each of the personality factors, three classes are formed corresponding to limited ranges of
expert data values: a high score (33-48 points), an average score (21-32 points), and a low score
(0-20 points). Histograms for image collections are constructed in RGB, HSV, and grayscale.
Thus, the data of seven histograms are fed to the input of the neural network classifier.
Statistical data for histogram construction are collected, for each tested user of the social
network, from all images presented in his or her collection.
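   A minimal sketch of this feature extraction in Python/OpenCV is given below. The per-channel bin split is an assumption: 256 bins for each of R, G, B, S, V, and grayscale plus 180 hue bins (OpenCV's 8-bit hue range) gives exactly the 1,716 inputs mentioned below, but the paper states neither the bin counts nor the exact normalization used to bring values into [0, 1].

```python
# Sketch of the per-user feature vector: RGB, HSV and grayscale histograms
# accumulated over all images in a user's collection and scaled into [0, 1].
# The bin split (256 per channel, 180 for hue) is an assumption that yields
# 6*256 + 180 = 1,716 inputs; the paper does not state the exact bin counts.
import cv2
import numpy as np

BINS = [256, 256, 256, 180, 256, 256, 256]  # B, G, R, H, S, V, grayscale

def user_histograms(image_paths):
    hists = [np.zeros(b) for b in BINS]
    for path in image_paths:
        bgr = cv2.imread(path)
        hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        channels = [bgr[:, :, 0], bgr[:, :, 1], bgr[:, :, 2],
                    hsv[:, :, 0], hsv[:, :, 1], hsv[:, :, 2], gray]
        for i, (channel, bins) in enumerate(zip(channels, BINS)):
            h, _ = np.histogram(channel, bins=bins, range=(0, bins))
            hists[i] += h
    # scale each histogram into [0, 1], as required by the classifier input
    return np.concatenate([h / h.max() if h.max() > 0 else h for h in hists])

def score_to_class(score):
    """Map a 0-48 factor score to the classes: 0=low, 1=average, 2=high."""
    return 2 if score >= 33 else 1 if score >= 21 else 0
```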
   The neural network classifier is a multilayer perceptron with one hidden layer containing
150 neurons. The input layer contains 1,716 neurons (7 histograms, with values ranging from 0
to 1), and the output layer contains three neurons corresponding to the classes. The
sigmoid is used as the activation function. An independent classifier is built for each of the
"Big Five" factors during testing. The general structure of the samples database is presented in
Table 4.
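   The paper does not name the framework used for this classifier; below is a minimal sketch of the described topology (1,716 input, 150 hidden, 3 output neurons, sigmoid activations) in CNTK, the library the authors use later for the predictive ANN. The squared-error loss and the learning rate are assumptions.

```python
# Sketch of the classifying MLP: 1,716 inputs, one 150-neuron hidden layer,
# 3 outputs, sigmoid activations. The framework choice (CNTK), loss function
# and learning rate are assumptions; the paper specifies only the topology,
# backpropagation training, and one million iterations.
import cntk as C

features = C.input_variable(1716)
label = C.input_variable(3)  # one-hot class: low / average / high

model = C.layers.Sequential([
    C.layers.Dense(150, activation=C.sigmoid),  # hidden layer
    C.layers.Dense(3, activation=C.sigmoid),    # one output per class
])(features)

loss = C.squared_error(model, label)
metric = C.classification_error(model, label)
learner = C.sgd(model.parameters, lr=C.learning_parameter_schedule(0.01))
trainer = C.Trainer(model, (loss, metric), [learner])
# for each of ~1e6 iterations:
#     trainer.train_minibatch({features: x_batch, label: y_batch})
```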

Table 4
Overall samples database structure

                      Factor        Number of class representatives
                                Low value Medium value High value
                        O          88            525             246
                        C          376           347             136
                        E          381           357             121
                        A          132           503             224
                        N          140           305             414



   When training the neural network classifiers (one for each factor), the sample of 859
subjects is divided into two non-intersecting parts: the training and the test samples. The
neural network classifiers are trained by error backpropagation for one million iterations.
Tables 5-9 present the test results for each personality factor.
   From the obtained tables, we can see that the recognition results on the training samples
are significantly higher than the corresponding results on the test samples. In general, this
can be explained by 1) the ANN's tendency to memorize rather than generalize information; 2) a
low correlation between factors and color-brightness representations, which is consistent with
the results of [Sim15, Cha17, Par15, Fer15].


4. The Method of the Predictive ANN
From each image of a user, given a sensitivity threshold (KAZE response), its own set of KAZE
descriptors is computed; each descriptor is a floating-point vector invariant to rotation,
displacement, and changes in lighting. In this study, the selected threshold is 0.005. Then the
KAZE descriptors of the training sample are clustered using the k-means method [Cel13]; the
expected number of clusters at the
Table 5
Classification results for the “O” factor

                                         Training samples
             Sent to ANN          Classification      Total classified      Classified
              input, class            result              samples          correctly, %
                              Low Average High
                Low            27       14       3           44               61.36
               Average         9       249       4          262               95.03
                High           8        16       99         123               80.48
                                                            429               87.41

                                            Test samples
             Sent to ANN           Classification       Total classified    Classified
              input, class             result              samples         correctly, %
                               Low Average High
                 Low            11       22       11           44             25.00
                Average         35      151       77          263             51.41
                 High           23       69       31          123             25.20
                                                              430             44.88



output of the clustering algorithm is set as a parameter. distAVG, the average distance from
the descriptors to their winning clusters, is computed over the whole training sample in order
to cut off the KAZE descriptors farthest from the selected clusters. The obtained distance is
used to sift out descriptors that lie far from the clusters, treating them as random outliers.
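   A minimal sketch of this step follows, assuming OpenCV's KAZE implementation and scikit-learn's k-means (the paper fixes only the 0.005 threshold and the k-means method itself); train_image_paths and the dictionary size k are placeholders.

```python
# Sketch of KAZE extraction and dictionary clustering: descriptors with a
# response threshold of 0.005 are clustered by k-means, and distAVG (the mean
# distance to the winning cluster) is computed as the outlier cutoff.
# train_image_paths and k are placeholders; the clustering backend is assumed.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def extract_kaze(path, threshold=0.005):
    kaze = cv2.KAZE_create(threshold=threshold)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, descriptors = kaze.detectAndCompute(gray, None)
    return descriptors  # None if no keypoint passes the threshold

all_descriptors = np.vstack([d for p in train_image_paths
                             if (d := extract_kaze(p)) is not None])

k = 2048  # dictionary size; the paper varies it from 64 to 16,384
kmeans = KMeans(n_clusters=k).fit(all_descriptors)
dist_to_winner = kmeans.transform(all_descriptors).min(axis=1)
dist_avg = dist_to_winner.mean()  # cutoff for sifting out outlier descriptors
```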
   The remaining image descriptors in the profiles of individual users of the social network
are used to form averaged occurrence vectors of KAZE descriptors in images: generalizing
BoW (Bag of visual Words) user descriptors. The size of a BoW descriptor is determined by
the number of clusters used. In the first step of computing a BoW descriptor, the numbers of
detected KAZE descriptors whose distance to the nearest cluster center does not exceed
distAVG are written, per cluster, in the form of a vector. In the second step, the obtained
vectors are normalized by dividing by the number of images in the user's profile. Based on the
obtained data, an additional vector of the maximum normalized values is fixed. This vector is
used in the final, third step, to bring the values of the BoW descriptor vectors within the
specified range. This preprocessing is necessary to normalize ANN operation, in particular by
avoiding zeroes in the BoW descriptors of the training sample.
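   A sketch of the three steps under the same assumptions follows; since the paper does not specify the target range of the final scaling, the per-dimension maxima of the training sample are used here, which brings the values into [0, 1]. train_profiles (a list of per-user image path lists) is a placeholder.

```python
# Sketch of the three-step BoW computation for one user profile; kmeans,
# dist_avg and extract_kaze come from the previous sketch.
def user_bow(image_paths, kmeans, dist_avg):
    counts = np.zeros(kmeans.n_clusters)
    for path in image_paths:
        desc = extract_kaze(path)
        if desc is None:
            continue
        dists = kmeans.transform(desc)           # distances to all centers
        winner = dists.argmin(axis=1)
        kept = dists.min(axis=1) <= dist_avg     # sift out outlier descriptors
        # step 1: per-cluster occurrence counts
        counts += np.bincount(winner[kept], minlength=kmeans.n_clusters)
    return counts / len(image_paths)             # step 2: per-image average

train_bow = np.vstack([user_bow(p, kmeans, dist_avg) for p in train_profiles])
max_values = train_bow.max(axis=0)               # step 3: scaling vector
max_values[max_values == 0] = 1.0                # guard empty dimensions
train_bow /= max_values                          # values now lie in [0, 1]
```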
   Training of the feedforward ANN is based on the "Microsoft Cognitive Toolkit" (CNTK)
library, version 2.6 [Mic17]. In the experiments, the size of the dictionary was varied from 64 to
16,384 items to achieve high accuracy of the predictive ANN on the test samples database. The
best configurations were chosen by testing different configurations of the ANN, varying the
number of layers, the number of neurons, and other settings. The activation functions
supported by CNTK [cnto17] and PolyWog wavelet functions [Wan19] were tested. The list of
learning optimizers was limited to the set supported in CNTK [cntl17].
   The first variant of the ANN architecture:
Table 6
Classification results for the “C” factor

                                         Training samples
             Sent to ANN          Classification      Total classified     Classified
              input, class            result              samples         correctly, %
                              Low Average High
                Low           174       14       0          188              92.55
               Average         19      154       0          173              89.01
                High           11       3        54          68              79.41
                                                            429              89.04

                                           Test samples
             Sent to ANN          Classification       Total classified    Classified
              input, class            result              samples         correctly, %
                              Low Average High
                Low           102      64        22          188             54.25
               Average         76      68        30          174             39.08
                High           30      27        11           68             16.17
                                                             430             42.09



    • an input layer, 125–1,000 neurons;

    • a dropout layer with a dropout probability of 0.01 [Sri14];

    • an output layer with 5 neurons.

  The best result was shown by the variant with 2,048 dictionary elements and 250 input-layer
neurons with the ReLU activation function f(x) = max(0, x); for the output layer, f(x) = x. The
ANN was trained with the Adam optimizer [Kin15] with L1- and L2-regularization coefficients
of 0.0001 and 0.01, respectively (see Table 10).
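  A sketch of this best first-variant configuration in CNTK 2.6 might look as follows; the squared-error loss, learning rate, and momentum are assumptions, since the paper reports only the architecture, optimizer, and regularization weights.

```python
# Sketch of the best first-variant predictive ANN: 2,048-element BoW input,
# one 250-neuron ReLU layer, 0.01 dropout, linear 5-neuron output (f(x) = x),
# Adam with L1 = 0.0001 and L2 = 0.01. Loss, learning rate and momentum are
# assumptions; the paper does not report them.
import cntk as C

bow = C.input_variable(2048)
factors = C.input_variable(5)  # the five "Big Five" factor scores

model = C.layers.Sequential([
    C.layers.Dense(250, activation=C.relu),
    C.layers.Dropout(0.01),
    C.layers.Dense(5),  # identity activation by default: f(x) = x
])(bow)

loss = C.squared_error(model, factors)
learner = C.adam(model.parameters,
                 lr=C.learning_parameter_schedule(0.001),
                 momentum=C.momentum_schedule(0.9),
                 l1_regularization_weight=0.0001,
                 l2_regularization_weight=0.01)
trainer = C.Trainer(model, loss, [learner])
```

  The second and third variants described below differ only in the stack of Dense/Dropout pairs (250-200-150 and 400-350-300-250-200-150 hidden neurons, respectively).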
  The second variant of the ANN architecture:

    • an input layer, 64–1,000 neurons;

    • a dropout layer with a dropout probability of 0.01;

    • a hidden layer, 50–800 neurons;

    • a dropout layer with a dropout probability of 0.01;

    • a hidden layer, 35–600 neurons;

    • a dropout layer with a dropout probability of 0.01;

    • an output layer with 5 neurons.
Table 7
Classification results for the “E” factor

                                          Training samples
             Sent to ANN           Classification      Total classified     Classified
              input, class             result              samples         correctly, %
                               Low Average High
                Low            184       6        0          190              96.84
               Average          11      167       0          178              93.82
                High            10       1        49          60              81.66
                                                             428              93.45

                                            Test samples
             Sent to ANN           Classification       Total classified    Classified
              input, class             result              samples         correctly, %
                               Low Average High
                Low             83      86        22          191             43.45
               Average          90      66        23          179             36.87
                High            22      28        11           61             18.03
                                                              431             37.12



   The best result was shown by the variant with 1,536 dictionary elements and a network
containing 250, 200, and 150 neurons in its hidden layers with the ReLU activation function;
the Adam optimizer with L1- and L2-regularization coefficients of 0.0001 and 0.01, respectively,
was used, again with no activation function on the output layer (see Table 11).
   Summarizing the results obtained, it should be noted that adding layers to the ANN
improved the standard deviation indicator on the test sample (from 12.43-13.39 to 11.40-12.34).
At the same time, the accuracy of factor extraction fluctuates insignificantly. The optimal
number of neurons in the first layer of both networks is 250. In this regard, it was decided to
increase the number of layers further, preserving the architectural features of the ANN and the
other experimental conditions. The third variant of the ANN contained 400, 350, 300, 250, 200,
and 150 neurons in its layers (interspersed with dropout layers); 1,536 dictionary elements were
used. The number of layers was selected experimentally. The test results are shown in Table 12.
   From the obtained tables, we can see that the results of personality trait prediction improve
as the number of ANN layers grows.


5. Conclusion
In conclusion, here are the results for the formulated tasks in general terms.

   1. On training samples, the average recognition accuracy of the classification ANN is in the
      range of 87-93%, while on test samples the recognition accuracy drops to 34-44%. This
      suggests that when training neural network classifiers on the existing data set, we
Table 8
Classification results for the “A” factor

                                          Training samples
             Sent to ANN           Classification      Total classified     Classified
              input, class             result              samples         correctly, %
                               Low Average High
                Low             58       4        4           66              87.87
               Average          6       242       3          251              96.41
                High            9        10       93         112              83.03
                                                             429              91.60

                                            Test samples
             Sent to ANN           Classification       Total classified    Classified
              input, class             result              samples         correctly, %
                               Low Average High
                Low             10       35       21           66             15.15
               Average          55      145       52          252             57.53
                High            21       69       22          112             19.64
                                                              430             41.16



      encounter an overfitting effect. In this case, the neural network classifier functions more
      as a model of associative memory than as a regression model. At the same time, it can
      identify some weak regularities in the processed data. When classifying test data, errors
      associated with the unbalanced training samples are observed. Based on the obtained
      results, it can be concluded that there is no significant connection between the
      color-brightness characteristics of image collections and the "Big Five" factors. Thus, the
      hypothesis of a significant relationship between the available graphic data and the
      five-factor personality model of the test subjects has not been confirmed in the first part
      of the study. It should be noted that a more accurate analysis would require a significant
      increase in the training samples and balancing of the classes.
   2. To compare the results of the predictive ANN with previous achievements, let us refer
      to [Kis19]. The results obtained in the present paper show that prediction of personality
      factors from BoW descriptors is inferior to prediction by direct image processing with
      neural networks when selecting the least expressed factors. In [Kis19], the accuracy of
      selecting the most clearly expressed factor on the test sample was 0.19-0.21, which is
      consistent with the results of this study. In both works, "Openness" and "Agreeableness"
      are predicted well, which confirms the previous conclusions about the connection
      between images posted by social network users and these personality factors. The
      poorest prediction is that of "Neuroticism".

   Thus, the following conclusion can be drawn from the whole series of studies carried out
by the authors, including the works [Kis19, Fra19]. In essence, it was possible to confirm the
absence of a significant relationship (correlation) between posted
Table 9
Classification results for the “N” factor

                                         Training samples
             Sent to ANN          Classification      Total classified     Classified
              input, class            result              samples         correctly, %
                              Low Average High
                Low            61       2         7          70              87.14
               Average         10      134        8         152              88.15
                High           10       4        193        207              93.23
                                                            429              90.44

                                           Test samples
             Sent to ANN          Classification       Total classified    Classified
              input, class            result              samples         correctly, %
                              Low Average High
                Low            21      23        26           70             30.00
               Average         36      43        74          153             28.10
                High           52      69        86          207             41.54
                                                             430             34.88



images and personality factors reported in [Sim15, Cha17, Par15, Fer15]. This can be partially
explained by the complexity and ambiguity of interpreting graphic content, as well as by the
small volume of the original sample. The authors hope to expand the sample further and
conduct a new cycle of research, taking into account not only the color and brightness
characteristics but also the content of published images.


6. Acknowledgements
The work was carried out with partial financial support from the Russian Foundation for Basic
Research (Project No. 182922003-mk).


References
[Sim15] K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image
        recognition. In: ICLR-2015, 2015.
        https://www.cc.gatech.edu/~hays/compvision/lectures/21.pdf, last accessed 2020/03/23.
[Cha17] S. Chandra, W. Lin, J. Carpenter et al. Studying personality through the content of
        posted and liked images on Twitter. In: ACM Web Science (June 25-28, 2017, Troy, NY,
        USA), 2017. https://www.sas.upenn.edu/~danielpr/files/persimages17websci.pdf, last
        accessed 2020/03/23.
Table 10
Joint prediction of five factors (first architecture variant)

                                   Training samples
                                                  O      C      E      A      N
  Average standard deviation values
  for individual "Big Five" factors:            0.29   0.32   0.35   0.32   0.27
  Average standard deviation value by individual factors: 0.31
  Average standard deviation value by individual users: 0.28
  Accuracy of selection of the most clearly expressed factor (from 0 to 1): 1.00
  Accuracy of selection of the least pronounced factor (from 0 to 1): 1.00

                                     Test samples
                                                  O      C      E      A      N
  Average standard deviation values
  for individual "Big Five" factors:           12.54  14.36  12.40  11.65  15.99
  Average standard deviation value by individual factors: 13.39
  Average standard deviation value by individual users: 12.43
  Accuracy of selection of the most clearly expressed factor (from 0 to 1): 0.24
  Accuracy of selection of the least pronounced factor (from 0 to 1): 0.29



[Par15] G. Park, H.A. Schwartz, J.C. Eichstaedt et al. Automatic personality assessment
        through social media language. Journal of Personality and Social Psychology,
        108(6):934-952, 2015. DOI: 10.1037/pspp0000020.
[Fer15] B. Ferwerda, M. Schedl, M. Tkalcic. Predicting personality traits with Instagram
        pictures. In: 3rd Workshop on Emotions and Personality in Personalized Systems,
        pp. 7-10, 2015.
[Kac20] A. Kachur, E. Osin, D. Davydov et al. Assessing the Big Five personality traits using
        real-life static facial images. SSRN Electronic Journal, 2020. DOI: 10.2139/ssrn.3567099.
[Cos92] P.T. Costa, R.R. McCrae. Normal personality assessment in clinical practice: The NEO
        personality inventory. Psychological Assessment, 4(1):5-13, 1992.
        DOI: 10.1037/1040-3590.4.1.5.
[Alc12] P.F. Alcantarilla, A. Bartoli, A.J. Davison. KAZE features. In: European Con-
        ference on Computer Vision (ECCV) (October 2012, Firenze, Italy), 2012, 14 p.
        http://www.robesafe.com/personal/pablo.alcantarilla/papers/Alcantarilla12eccv.pdf,
Table 11
Joint prediction of five factors (second architecture variant)

                                   Training samples
                                                  O      C      E      A      N
  Average standard deviation values
  for individual "Big Five" factors:            0.62   0.56   0.56   0.69   0.51
  Average standard deviation value by individual factors: 0.59
  Average standard deviation value by individual users: 0.52
  Accuracy of selection of the most clearly expressed factor (from 0 to 1): 0.98
  Accuracy of selection of the least pronounced factor (from 0 to 1): 1.00

                                     Test samples
                                                  O      C      E      A      N
  Average standard deviation values
  for individual "Big Five" factors:           10.89  13.73  11.11  10.87  15.11
  Average standard deviation value by individual factors: 12.34
  Average standard deviation value by individual users: 11.40
  Accuracy of selection of the most clearly expressed factor (from 0 to 1): 0.21
  Accuracy of selection of the least pronounced factor (from 0 to 1): 0.31



          last accessed 2020/03/23.
[Cel13] M.E. Celebi, H.A. Kingravi, P.A. Vela. A comparative study of efficient initialization
        methods for the k-means clustering algorithm. 2013. https://arxiv.org/pdf/1209.1960,
        last accessed 2020/03/23.
[Mic17] Microsoft Cognitive Toolkit. Microsoft Research, 2017. https://www.microsoft.com/en-
          us/cognitive-toolkit/, last accessed 2020/03/23.
[cnto17] cntk.ops package – Python API for CNTK 2.6 documentation. Microsoft Research,
          2017. https://cntk.ai/pythondocs/cntk.ops.html, last accessed 2020/03/23.
[Wan19] L. Wang. Software tools using methods and algorithms of multi-scale wavelet analysis
        for image processing and search. Dissertation for the degree of candidate of technical
        sciences, specialty 05.13.11, Moscow, 138 pp., 2019. (In Russian)
[cntl17] cntk.learners package – Python API for CNTK 2.6 documentation. Microsoft Research,
          2017. https://cntk.ai/pythondocs/cntk.learners.html, last accessed 2020/03/23.
[Sri14] N. Srivastava, G. Hinton, A. Krizhevsky et al. Dropout: A simple way to prevent
Table 12
Joint prediction of five factors (third architecture variant)

                                   Training samples
                                                  O      C      E      A      N
  Average standard deviation values
  for individual "Big Five" factors:            1.00   0.77   0.83   0.98   0.68
  Average standard deviation value by individual factors: 0.85
  Average standard deviation value by individual users: 0.75
  Accuracy of selection of the most clearly expressed factor (from 0 to 1): 0.98
  Accuracy of selection of the least pronounced factor (from 0 to 1): 0.95

                                     Test samples
                                                  O      C      E      A      N
  Average standard deviation values
  for individual "Big Five" factors:            8.74  12.64   8.97   9.92  13.85
  Average standard deviation value by individual factors: 10.82
  Average standard deviation value by individual users: 9.98
  Accuracy of selection of the most clearly expressed factor (from 0 to 1): 0.21
  Accuracy of selection of the least pronounced factor (from 0 to 1): 0.38



        neural networks from overfitting. Journal of Machine Learning Research, 15:1929–
        1958, 2014.
[Kin15] D.P. Kingma, J. Ba. Adam: A method for stochastic optimization. In: International
        Conference on Learning Representations (ICLR), 13 pp., 2015.
        https://arxiv.org/pdf/1412.6980.pdf, last accessed 2020/03/23.
[Kis19] N.V. Kiselnikova, E.A. Kuminskaya, A.V. Latyshev et al. Tools for the analysis of
        the depressed state and personality traits of a person. Program Systems: Theory and
        Applications, 3:129-159, 2019. DOI: 10.25209/2079-3316-2019-10-3-129-159. (In
        Russian)
[Fra19] V. Fralenko, V. Khachumov, M. Khachumov. Correlation analysis and prediction of
        personality traits using graphic data collections. In: SHS Web of Conferences, vol. 72:
        Proceedings of the International Scientific Conference "Achievements and Perspectives
        of Philosophical Studies" (APPSCONF-2019), 2019. DOI: 10.1051/shsconf/20197201012.
        https://www.shs-conferences.org/articles/shsconf/pdf/2019/13/shsconf_appsconf2019_01012.pdf,
        last accessed 2020/03/23.