=Paper= {{Paper |id=Vol-2552/Paper9 |storemode=property |title=Distribution of Attributes as a Feature of Individual Style |pdfUrl=https://ceur-ws.org/Vol-2552/Paper9.pdf |volume=Vol-2552 |authors=Sergey Andreev }} ==Distribution of Attributes as a Feature of Individual Style== https://ceur-ws.org/Vol-2552/Paper9.pdf
        Distribution of attributes as a feature of
                    individual style∗
                                         Sergey Andreev
                                         smol.an@mail.com
                             Smolensk State University, Smolensk
                                     Russian Federation



                                                 Abstract
            The distribution of two types of attributes (adjectives and nouns in the genitive
        construction) is studied. Busemann’s coefficient reveals different types of relationship
        between adjectival and nominal attributes in the texts of 6 Russian female authors. At the
        same time, a power function was found to fit the data well irrespective of the peculiarities
        of the authors’ individual styles, revealing a general order in the distribution of attributes.
            Keywords: distribution, Busemann’s coefficient, attributes, power function,
        individual style.




1       Introduction
To analyze individual styles, a large set of characteristics is used to reveal the peculiarities of an
author’s speech adequately and to establish a reliable basis for differentiating styles. This set
includes a substantial number of speech properties, both formal and semantic [Juola, 2006; Holmes,
1994; Rudman, 1998]. One such property whose prognostic value in this respect should be tested is
the frequency of attributes in the texts of different authors, both in general and for particular
attributive types [Köhler, Altmann, 2014].
    The syntactic position of an attribute (adnominal) has at least one important peculiarity: it is
not obligatory in verbal syntactic structure and is thus highly optional, depending on the author’s
inclinations and literary taste. On the other hand, attributes play a highly important role in
elaborating topics.
    As a result, one can suppose that the frequency and the distribution patterns of different
types of attributes can serve as an important feature of an author’s style. In other words, they can
serve as an explicit feature for comparing and/or discriminating between the styles of different authors.
    According to the part of speech of the word used as an attribute, different types of attributes
can be established: adjectives (green leaves), pronouns (my friend, this book, and other types of
pronouns), participles (dancing people), infinitives (a wish to win), adverbs (a room upstairs) and
some others. One of the most frequent and semantically important is the genitive construction, which
in Russian is formed by a noun in the genitive case, corresponding to the English genitive
of-construction (the book of Peter). Such genitive constructions (N, denoted G in the formulas of
Section 2) reflect the nominal strategy of description, as opposed to the more standard strategy of
using adjectives (A).

    ∗ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
     Two questions arise here: are the relations between the frequencies of the two types (N and A)
constant for all authors, and, if not, to what extent do they differ across texts? These questions
are examined on a database that includes 6 female Russian authors (Tokareva, Ulitskaya, Tolstaya,
Marinina, Ustinova, Polyakova). The choice was motivated by the following reasons:

    • all the authors are of the same gender;

    • they are very popular in their genres;

    • the genres of their novels are rather different: the first three authors write in the
      belles-lettres style, the last three are detective writers.


2     Methods
For the analysis, samples from 3 books by each author were chosen. Each sample was 1000 words long
and was taken from the beginning of the book. The list of the novels is given in the appendix.
Adjectival and genitive attributes were marked in the samples.
    To find out the proportions between these two types of attributes, Busemann’s coefficient was
used [Altmann, 2015]:

$$C = \frac{G}{A + G}, \tag{1}$$

    where C is Busemann’s coefficient, A stands for all the adjectival attributes, and G stands for
all the genitive attributes.
    The coefficient can vary between 0 (genitive attributes are completely absent) and 1 (no
adjectival attributes were registered). High values of C (C > 0.5) show that G-constructions play
a more important role in description; low values (C < 0.5) indicate the predominance of A in the
author’s style. To test the results, the chi-square statistic was used [Andreev, Místecký,
Altmann, 2018]:

$$\chi^2 = \frac{(A - G)^2}{A + G}. \tag{2}$$

    Busemann’s coefficient is statistically significant with 1 degree of freedom at p < 0.05 if
χ² > 3.84.
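
    Formulas (1) and (2) can be illustrated with a minimal Python sketch. The function name and the
counts below are hypothetical and are not taken from Table 1; the threshold of 3.84 is the
conventional chi-square critical value for 1 degree of freedom at the 0.05 level and is an
assumption of this sketch.

```python
def busemann(adjectival: int, genitive: int) -> tuple[float, float]:
    """Return (C, chi2) for counts of adjectival (A) and genitive (G) attributes."""
    total = adjectival + genitive
    if total == 0:
        raise ValueError("at least one attribute is required")
    c = genitive / total                          # formula (1)
    chi2 = (adjectival - genitive) ** 2 / total   # formula (2)
    return c, chi2


# Hypothetical counts for one 1000-word sample (not from the paper).
c, chi2 = busemann(adjectival=60, genitive=20)

# Assumed critical value: chi-square, 1 degree of freedom, p = 0.05.
significant = chi2 > 3.84
print(f"C = {c:.3f}, chi2 = {chi2:.2f}, significant = {significant}")
```

    In this made-up sample C is below 0.5, so adjectival attributes would predominate.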


3     Results & Discussion
The results of the analysis are shown in Table 1. In all cases the results proved to be statistically
significant. Ranking the values of C in increasing order yields the picture shown in Figure 1: the
texts form a gently rising curve. It is noteworthy that in many cases texts by the same author are
positioned close to one another. Thus the three novels by Ustinova (T13–T15) are placed next to each
other and, moreover, have nearly the same values of the coefficient. Also close to one another are
T1 and T3 (Tokareva), T5 and T6 (Ulitskaya), T7 and T9 (Tolstaya), and T11 and T12 (Marinina). This
demonstrates a comparatively similar relation between the two types of attributes in the works of
the same author.


    It should also be noted that the authors of detective novels have somewhat lower values of the
coefficient.
    The next step was to analyze how the relationship between genitive and adjectival attributes
develops from the beginning of the samples to the end. For this purpose, the number of all adjectival
attributes found before each genitive in the text was counted on a cumulative basis.
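
    The cumulative count can be sketched as follows. The input encoding is an assumption made for
illustration: each marked attribute in a sample is represented, in reading order, by the tag "A"
(adjectival) or "G" (genitive). The function name and the tag sequence are hypothetical.

```python
def adjectives_before_genitives(tags):
    """For each genitive ('G'), in reading order, return the cumulative
    number of adjectival attributes ('A') seen before it."""
    counts = []
    seen_adjectives = 0
    for tag in tags:
        if tag == "A":
            seen_adjectives += 1
        elif tag == "G":
            counts.append(seen_adjectives)
    return counts


# Hypothetical sequence of marked attributes (not from the paper).
sample = ["A", "A", "G", "A", "G", "G", "A", "A", "A", "G"]
cumulative = adjectives_before_genitives(sample)

# x is the ordinal number of the genitive, y the cumulative adjective count.
for x, y in enumerate(cumulative, start=1):
    print(x, y)
```

    The resulting (x, y) pairs correspond to the first two columns described for Table 2.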
    As an example, let us consider the development of the relations between these two attribute
types over the text in two novels: “Skazat’-ne skazat’” by Tokareva (T1) and “Moye vtoroye ya” by
Polyakova (T16). The results of the counts in these two texts are presented in Table 2. The first
column gives the ordinal number of each genitive construction in the text. The second column gives
the cumulative number of adjectival attributes that occur in the text before the given genitive
construction. The third column contains the theoretically expected frequencies of adjectival
attributes, computed from the formula.
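     The formula is as follows [Naumann et al., 2012]: $y = a x^{b}$, where x is the ordinal number
of the genitive construction, y is the cumulative number of adjectival attributes, and a and b are
parameters.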
     The results are shown in graphical form in Figures 2 and 3. As shown in the figures, the
observed frequencies of adjectival attributes (dots) are very close to the theoretically expected
ones, shown as a solid curve.
     The values of the parameters a and b are as follows. For T1, a = 4.274 and b = 0.812; for T16,
a = 1.638 and b = 1.275. If b < 1 the curve is concave (Figure 2); if b > 1 the curve is convex
(Figure 3) [Naumann et al., 2012: 26–27].
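
     One way to estimate a and b is sketched below. The paper does not state which fitting
procedure was used, so the method here is an assumption: ordinary least squares on log-transformed
data, which may give slightly different estimates from a nonlinear fit. The function names are
hypothetical and the data are synthetic, generated from the T1 parameters only to check that the
procedure recovers them.

```python
import math


def fit_power(xs, ys):
    """Estimate a and b in y = a * x**b by least squares on (log x, log y).
    Points with x <= 0 or y <= 0 are skipped because the log is undefined."""
    pairs = [(math.log(x), math.log(y)) for x, y in zip(xs, ys) if x > 0 and y > 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two positive points")
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in pairs)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in pairs)
    b = sxy / sxx                          # slope in log-log space
    a = math.exp(mean_v - b * mean_u)      # intercept transformed back
    return a, b


def curve_shape(b):
    """Classify the power curve by its exponent, as described in the text."""
    if b < 1:
        return "concave"
    if b > 1:
        return "convex"
    return "linear"


# Synthetic data generated from the T1 parameters (not the observed counts).
xs = list(range(1, 21))
ys = [4.274 * x ** 0.812 for x in xs]
a, b = fit_power(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, shape = {curve_shape(b)}")
```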
     Table 3 contains the values of the parameters a and b of the power function and the coefficient
of determination R² for all 18 texts.
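
     The coefficient of determination can be computed as in the sketch below, which assumes the
usual definition R² = 1 − SS_res / SS_tot. The observed values are hypothetical; only the
parameters a and b are the ones reported for T1.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


a, b = 4.274, 0.812                      # parameters reported for T1
observed = [4, 8, 10, 13, 16, 18]        # hypothetical cumulative counts
predicted = [a * x ** b for x in range(1, len(observed) + 1)]
r2 = r_squared(observed, predicted)
print(f"R^2 = {r2:.3f}")
```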
     R² for all the novels is very high, which indicates a good fit. Parameter b, which shows the
increase or decrease of adjectival attributes towards the end of the text, differs considerably even
among the novels of the same writer. Only in one case (Ulitskaya) do all the novels of the same
author show the same tendency, a gradual decrease in the number of adjectives over the text, since
in all her novels (T4–T6) b < 1. The biggest increase in the number of adjectival attributes and,
correspondingly, decrease in genitives is seen in T2 and T3 (Tokareva). Conversely, the largest
increase in the number of genitives takes place in a novel by Ulitskaya (T6). Marinina (T10–T12)
demonstrates a highly balanced relationship between adjectival attributes and genitives from the
beginning to the end of her novels.
    On the whole, the analysis revealed the existence of general tendencies as well as certain
differences in style. The different aspects of the relations between the two main types of
attributes in a text make it possible to estimate the role of different kinds of descriptiveness in
an author’s style and can be used as objective criteria for the differentiation and classification
of styles. It should be noted that further steps are needed to obtain a more complete picture of
the relationship between different types of attributes.




References
[Juola, 2006] Juola, P. (2006) Authorship attribution // Foundations and Trends in Information
     Retrieval. December 2006. Vol. 1. Is. 3. Hanover, MA, USA: Now publishers Inc., 2006. P.
     233–334.

[Holmes, 1994] Holmes, D.I. (1994) Authorship attribution // Computers and the Humanities. 1994.
    Vol. 28, No. 2. P. 87–106.

[Rudman, 1998] Rudman, J. (1998) Non-traditional authorship attribution studies in the Historia
    Augusta: Some caveats // Literary and Linguistic Computing. 1998. Vol. 13, No. 3.
    P. 151–157.

[Köhler, Altmann, 2014] Köhler R., Altmann G. (2014) Problems in Quantitative Linguistics.
     Lüdenscheid: RAM-Verlag, 2014. – 148 p.

[Altmann, 2015] Altmann G. (2015) Problems in Quantitative Linguistics. 2015. Vol. 5. Lüdenscheid:
    RAM-Verlag.

[Andreev, Místecký, Altmann, 2018] Andreev S., Místecký M., Altmann G. (2018) Sonnets:
    Quantitative Inquiries. Studies in Quantitative Linguistics, 29. Lüdenscheid: RAM-Verlag,
    2018. – 130 p.

[Naumann et al., 2012] Naumann S., Popescu I.-I., Altmann G. (2012) Aspects of nominal style //
    Glottometrics. 2012. Vol. 23. P. 23–55.



