<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Weak Genres: Modeling Association Between Poetic Meter and Meaning in Russian Poetry</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Roman Leibov</string-name>
          <email>roman.leibov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artjoms Šeļa</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Higher School of Economics</institution>
          ,
          <addr-line>ul. Miasnitskaia 20, 101000 Moscow</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institute of Polish Language (Polish Academy of Sciences)</institution>
          ,
          <addr-line>al. Mickiewicza 31, 31-120 Kraków</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Tartu</institution>
          ,
          <addr-line>Ülikooli 18, 50090 Tartu</addr-line>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <fpage>12</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>This paper aims to formalize an established theory in versification studies known as the “semantic halo of a meter”, which states that different metrical forms in modern poetry accumulate and retain distinct semantic associations. We use LDA topic modeling on a large-scale corpus of Russian poetry (1750-1950) to represent each poem in one topic space and then represent each meter as a distribution of aggregated topic probabilities. Using unsupervised classification and extensive sampling we show that robust form-meaning associations are present both within and between metrical forms: two samples of the same meter tend to appear most similar, while two metrical forms of the same family tend to group together. This effect is present when the corpus is controlled for chronology and is not an artifact of population size. We argue that a similar approach could be used to align and compare semantic halos across languages and traditions to give meaningful general-level answers to questions of literary history.</p>
      </abstract>
      <kwd-group>
        <kwd>poetry</kwd>
        <kwd>semantics</kwd>
        <kwd>meters</kwd>
        <kwd>topic modeling</kwd>
        <kwd>clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        rhythm shaped meter’s semantics [
        <xref ref-type="bibr" rid="ref24 ref47">24, 49</xref>
        ]. Based on the close reading of thousands of 19th
century poems Mikhail Gasparov demonstrated that the connection should be historical and
is determined by a meter’s origins in a local tradition and usage over time that accumulates a
distributed, yet distinct semantic profile [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        Despite the attractiveness of the findings, the lack of formalization makes the semantic halo a
target that is easy to criticize and hard to defend. Even if some specific “halos” are not a product of
a simple sampling error, any generalizations about the mechanism itself and the structure of
relationships between metrical forms remain elusive. A few previous empirical attempts to
approach meter-meaning association in Russian [
        <xref ref-type="bibr" rid="ref33">35</xref>
        ] and Bashkir [
        <xref ref-type="bibr" rid="ref31">33</xref>
        ] poetry were able to broadly
confirm lexical diferences between metrical forms while relying on wordlists comparison, which
provides us an entry point to the problem.
      </p>
      <p>This study tries to address the presence of the semantic halo in Russian poetry using a
set of abstracted semantic features (topics) that describe each individual poem in a uniform
way. Having all texts aligned within one model allows for performing flexible tests and using
classification algorithms to explicate and verify scholarly assumptions. We rely on hierarchical
clustering to assess the level of within-meter semantic similarities (are meters similar to
themselves) and between-meter relationships (how metrical forms relate to each other). Following
the analysis we discuss how formalizing the semantic halo of meter could enhance our
understanding of it as a mechanism of cultural transmission and how a similar approach could be
used to study the halo effect across various languages and traditions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Corpus</title>
      <p>
        Data used in this study come from the Poetry Sub-collection of the Russian National Corpus [
        <xref ref-type="bibr" rid="ref38">40</xref>
        ]
that includes texts spanning from the 18th to the late 20th century. It roughly covers the
whole history of modern Russian versification that started with the introduction of German
accentual-syllabic verse in the 1730s. The corpus has a clear canonicity bias in its design: 18th- and
19th-century texts were included in the collection based on their availability in 20th-century
scholarly editions [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. This leaves a lot of earlier poetic production outside of the academic canon
unaccounted for and partially drives the inequality in the chronological distribution of the poems:
more than 75% of texts come from the 20th century. This is also a very non-uniform pool of
texts, because starting in 1917 Russian poetry split into three generally isolated traditions:
Soviet, émigré and unofficial underground. Having no automated way to separate them, we
limit the corpus to texts dated no later than 1950, which roughly excludes most of the underground works
and stops the clock before the noticeable drift towards non-classical versification begins.
After all subsetting operations and preprocessing steps (see Section 3) we are left with 47,804
texts (2,275,233 words).
      </p>
      <p>
        This study is mainly focused on accentual-syllabic (AS) metrical (and usually rhymed)
poetry, which survived in Russian versification much longer than in the Western traditions
that turned to vers libre [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. AS systems of versification are based on strict limitations
on both the number of stresses and the number of syllables in a line, as compared to purely
accentual (only stress count matters) or purely syllabic (only syllable count matters) systems. AS
meters are built of smaller recurring units of rhythm, feet, which organize stressed and unstressed
syllables in patterns, usually of two or three (binary or ternary feet). Since a metrical scheme
is an abstraction of poetic rhythm and is constantly altered (expected stressed positions are left
unstressed and vice versa), we usually speak of strong vs. weak positions in a meter instead
of “stressed” or “unstressed” ones. Table B.3 provides a summary of all the classical AS forms that
were used in this study. The one exception is the so-called “dolniks”, which step away from AS by
loosening the rules for syllable count, but their abundance in the 20th century cannot be ignored.
      </p>
      <p>
        We use the existing corpus metadata, which includes annotations for poetic form, to label
each poem, where possible, with a single unambiguous metrical formula. Corpus annotation
was done institutionally under the supervision of experts in linguistics and prosody; however,
inter-annotator agreement and error rate were not reported [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. We expect the accuracy to be
very high, especially for classic AS forms that are easily distinguishable even with minimal
training. We asked three literary scholars to verify 100 original corpus annotations for metrical
forms: on average, they marked 97.7% of labels as “true”. Mean inter-annotator agreement was
96.6% (with low agreement on which labels to consider “false”).
      </p>
      <p>We were conservative in labeling texts with corpus metadata, preferring homogeneous
metrical notations and excluding most of the complex cases of polymetry, logaeds or other
heterogeneous forms. We also used simplified information on stanza, relying on just a general
clausula pattern. Throughout this paper we use a metrical notation derived from the Russian
school of metrics, e.g. Iamb-4-fm stands for Iambic Tetrameter with regularly alternating lines
of feminine and masculine clausula (or acatalectic and catalectic lines).</p>
      <p>For the purposes of this study we infer three levels of metrical expression from a single
metrical formula:
        <list list-type="order">
          <list-item><p>A general family of metrical pattern (e.g., Trochee, a meter based on binary feet with
the strong position on the first syllable);</p></list-item>
          <list-item><p>A meter of the poem based on the number of feet (e.g., Trochee-5, Trochaic Pentameter
composed of five trochaic feet);</p></list-item>
          <list-item><p>A catalectic variant of the meter that describes the pattern of non-stressed syllables after
the last stressed one (e.g., Trochee-5-fm; f stands for a feminine (Xu), m for a masculine (X) and
d for a dactylic (Xuu) ending of a line).</p></list-item>
        </list>
      </p>
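      <p>The three levels can be read off a formula mechanically. The following is a minimal sketch, assuming that every formula has the hyphen-separated shape family-feet or family-feet-clausula, as in the examples Trochee-5 and Iamb-4-fm; the names MetricalForm and parse_formula are hypothetical and are not part of the corpus tooling, and formulas for other forms may need different handling.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import NamedTuple, Optional


class MetricalForm(NamedTuple):
    family: str             # e.g. "Trochee"
    meter: str              # e.g. "Trochee-5"
    variant: Optional[str]  # e.g. "Trochee-5-fm", or None if no clausula is given


def parse_formula(formula: str) -> MetricalForm:
    """Split a formula such as 'Trochee-5-fm' into family, meter and variant."""
    parts = formula.split("-")
    if len(parts) not in (2, 3):
        # Assumption: anything else is a heterogeneous form that is excluded.
        raise ValueError(f"unexpected metrical formula: {formula!r}")
    family, feet = parts[0], parts[1]
    meter = f"{family}-{feet}"
    variant = formula if len(parts) == 3 else None
    return MetricalForm(family, meter, variant)


form = parse_formula("Trochee-5-fm")
print(form.family, form.meter, form.variant)
```
]]></preformat>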
    </sec>
    <sec id="sec-3">
      <title>3. Modeling semantics</title>
      <p>
        We aim to model meter-meaning association through the semantic features of individual poems.
To do so, we train one Latent Dirichlet Allocation (LDA) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] topic model on the whole corpus,
without any aggregation of poems, writing metrical labels and other metadata in document
names.
      </p>
      <p>
        Topic models is a collective name for a large family of information extraction algorithms that
look for groups of co-occurring elements in a collection of documents. These groups are labelled
topics (the original goal was text mining), but models are transferable to, e.g. molecules [
        <xref ref-type="bibr" rid="ref53">55</xref>
        ],
music [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] and genes [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], or to any task that requires abstracting groups of similar behaviour from
a large number of features (words, chords, genes, chemical elements, etc.). Topic modeling is
now widely used for text mining and classification in the humanities and social sciences [
        <xref ref-type="bibr" rid="ref20 ref40 ref54 ref8">20, 56, 8,
42</xref>
        ]; it has also been shown multiple times that LDA is applicable to corpora of short poetic
texts [
        <xref ref-type="bibr" rid="ref1 ref19 ref29 ref34">1, 31, 19, 36</xref>
        ].
      </p>
      <p>
        Topic models have shown promise in modeling general questions of cultural history: the rate
of change in popular music [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], modes of scientific exploration of information [
        <xref ref-type="bibr" rid="ref28">30</xref>
        ], or innovation
and retention in historical political discourse [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In these cases the topical representation of entities
served as a mere approximation of “contents”. We aim for a similar abstracted representation
of poetic language, very loosely mimicking scholars who operated with high-order semantic labels
to describe meanings specific to meters, such as Night, Road or Death (themes that, according to
Gasparov, collectively express some of the main semantic directions of Trochaic Pentameter in
Russian poetry [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]).
      </p>
      <p>
        LDA is a generative probabilistic model that is based on a few very important assumptions:
1) each text in the collection is assumed to be generated from k topics; 2) each
topic is a probability distribution over all available features (where most of the features are very
unlikely). LDA represents each document as a probability distribution over all k topics, so that
all documents can essentially be described by equal-sized vectors in one “topic space”. In
other words, LDA tries to infer a specific number of groups of co-occurring words from a corpus
automatically; as a consequence, each document becomes represented as a combination of these
groups. We consider the use of topic models crucial to our goals, because 1) LDA allows for
uniform semantic abstraction at the level of single poems; 2) it expresses each document
with a potentially low number of interpretable dimensions; 3) topic probabilities of poems allow
for a straightforward follow-up analysis; 4) topic models make our approach independent of
language and specific domain expertise.
      </p>
      <p>
        We followed several corpus preprocessing steps before training a model:
        <list list-type="order">
          <list-item><p>All texts were lemmatized using mystem 3.1 [<xref ref-type="bibr" rid="ref41">43</xref>];</p></list-item>
          <list-item><p>A general-purpose stop-word list was applied to the corpus (removing conjunctions,
particles, prepositions, pronouns and numerals);</p></list-item>
          <list-item><p>We wanted to reduce the lexical variance of the corpus, taking only the 5,000 most frequent words
to build a model. LDA output usually improves with less sparse data, so removing rare
words is a common procedure. However, since our goal was a semantic simplification of
poetic language, we used a separate word embedding model trained on the same corpus to
replace words outside of this “core” 5,000 (word2vec implementation via the gensim Python
library, vector size=300). A word was replaced if it had a semantic neighbour among
its 10 contextually most similar words (measured by the cosine similarity of the corresponding
vectors) that was also a member of the top-1000 words. This procedure allowed us to
replace words with their hyponyms, more frequent synonyms or grammatical variants
(e.g. replacing diminutive forms) and, in some cases, to explicate the traditional metonymy
of poetic language (e.g. replacing “Pontus” with “ocean”). The procedure was not perfect
and introduced some noise, which, however, did not have a noticeable effect on the model.
We also note here that our results do not change radically if this contextual replacement
does not happen or if we use another limit on the most frequent words (e.g. 1,000). Despite
the insignificant effects, we still report results for data with contextual replacements, since
we believe that the chosen direction towards semantic abstraction is important and
should be improved in the future. We provide all main results for the unaltered corpus
in the Appendix (Table B.5).</p></list-item>
          <list-item><p>The corpus was also limited by text size to provide LDA with an at least comparable
range of word distributions per document. We removed extra-small (fewer than 4 lines)
and extra-large (more than 100 lines) poems, which left us with approximately 95% of
all texts. We further trimmed the corpus based on word counts, keeping only the texts
between the .10 and .90 percentiles of the size distribution (between 20 and 102 words, which
approximately corresponds to poems of 12 to 50 lines once stop-word removal is accounted for).
These limitations mean that our model is primarily focused on short lyrical poetry (a
dominant form in the Russian tradition, which experienced a rapid shrinkage of mean poem
length [<xref ref-type="bibr" rid="ref43">45</xref>]). We believe, however, that whatever results we have should also apply to
long narrative poetry, where semantic traditions of metrical usage were predicted to be
much more pronounced [<xref ref-type="bibr" rid="ref14">14</xref>]. The final text count after all operations is 47,804 (of which
39,220 texts have a single label for their form, derived from the corpus annotations).</p></list-item>
        </list>
      </p>
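      <p>The contextual replacement in the third preprocessing step can be illustrated as follows. This is a minimal sketch, not the original pipeline: it uses plain Python lists instead of a trained word2vec model, the toy vectors are invented, the function name replacement_for is hypothetical, and dropping a word that has no suitable neighbour is our assumption.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def replacement_for(word, vectors, core, targets, n_neighbours=10):
    """Return `word` itself, a frequent stand-in for it, or None.

    core    -- the most frequent words kept as they are (5,000 in the text)
    targets -- words allowed as replacements (the top 1,000 in the text)
    """
    if word in core:
        return word
    if word not in vectors:
        return None  # assumption: words without a vector are dropped
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    for neighbour in others[:n_neighbours]:
        if neighbour in targets:
            return neighbour
    return None  # assumption: no frequent neighbour means the word is dropped


# Toy 2-d vectors, invented for illustration only.
vectors = {"ocean": [1.0, 0.1], "pontus": [0.9, 0.2], "road": [0.1, 1.0]}
core = {"ocean", "road"}
print(replacement_for("pontus", vectors, core, targets=core))
```
]]></preformat>
      <p>In this toy setting the rare word is mapped to its closest frequent neighbour; with a real embedding model the neighbour list would come from the model's own similarity query.</p>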
      <p>
        There is no universally recognized way to determine an optimal number of topics for the
model [
        <xref ref-type="bibr" rid="ref39">41</xref>
        ]: in this paper we report results for LDA trained with 80 topics, which was a midpoint
model in a trade-off between topic coherence (log-likelihood) and perplexity (the “surprise” of the
model when predicting unseen data). The main study procedures were also applied to a range of
LDA models with a variable number of topics (from 10 to 200), which showed robust performance
overall (Table B.5 in the Appendix). We set the LDA priors to alpha=0.1 (we do not expect many
topics to generate a single text, since we do not want to swamp the distribution) and beta=0.3 (we
do not expect too many words to contribute to a topic, but some).
      </p>
      <p>To perform a quick sanity check of the final model we can look at the distinctive topics in a
few meters that have previously been described qualitatively (Table B.4 in the Appendix). While some
topics could be seen as compatible with the assumed semantic halo of meters, there is, of course, no
direct relationship. Topics do not correspond to the abstracted metrical “themes” (Gasparov
also did not use them systematically across his descriptions of different meters), but they still
appear interpretable and could be used for our purposes of a distributed representation of a
meter’s “content”.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Tracing the halo</title>
      <sec id="sec-4-1">
        <title>4.1. Within-meter similarities</title>
        <p>The theory of the “semantic halo of meter” assumes that meaning is non-randomly distributed
across metrical forms and that each meter historically builds a unique semantic valency. The
theory also (somewhat implicitly) considers the halo effect cumulative: we would not be able to
reconstruct a meter’s semantics by looking at an isolated poem, but a distinct pattern will emerge
from a much larger sample of the meter’s usage in a tradition. We can rephrase these premises to
say that meter-meaning association assumes some kind of self-similarity within a poetic form.
If the halo effect exists, then two independent pools of poems coming from the same meter
should appear semantically closer to each other than to other samples coming from different
meters.</p>
        <p>
          Let us say we are an observer of the whole tradition, looking at the metrical halos from the
year 1950 (the upper chronological boundary of our corpus). To test if the meaning-meter
association is noticeable at a general level we perform unsupervised classification on two random
samples (without replacement) of 200 poems for each meter that has at least 500 poems. For
each sample we calculate mean topic probabilities to represent the aggregated topic distribution
within a meter. Since we are dealing with probability distributions, we proceed to calculate the
Jensen-Shannon divergence (a symmetrized Kullback-Leibler divergence [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]) between all
samples and build hierarchical clusters from the resulting distances. Resampling and recalculation
are then repeated 100 times. This results in 100 dendrograms with clustering information that
is used to build a “majority-rule” consensus tree [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]: it draws branches that correspond to
50% agreement across all dendrograms, so that two branches will not be connected if they did
not cluster together in at least half of the trees (Fig. 2a).
        </p>
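        <p>The aggregation and distance steps of this procedure can be sketched in a few lines. The following is an illustration under stated assumptions: each poem is a plain list of topic probabilities, the logarithm is taken to base 2 so that the divergence lies between 0 and 1, and the function names are hypothetical. The hierarchical clustering and the consensus tree are not shown.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def mean_topics(poems):
    """Average per-poem topic probabilities into one vector for a sample."""
    k = len(poems[0])
    return [sum(p[i] for p in poems) / len(poems) for i in range(k)]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def jsd(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Two toy samples of two poems each, with three topics (invented numbers).
sample_a = mean_topics([[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]])
sample_b = mean_topics([[0.1, 0.1, 0.8], [0.2, 0.2, 0.6]])
print(jsd(sample_a, sample_b))
```
]]></preformat>
        <p>A matrix of such pairwise values over all samples is what a hierarchical clustering routine would take as its distance input.</p>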
        <p>The same procedure could be applied at the level of metrical variants. Because of data
sparsity and noise in metrical annotation, we use only variants of Iamb-4 that have at least
200 poems, while removing the most frequent variant for its diffused semantics (Iamb-4-fm).
This leaves us with only four forms of Iamb-4 (Fig. 2b).</p>
        <p>Without addressing further complications of this approach, it is clear that within-meter
semantic similarities are present in the corpus. It is also apparent that semantic difference
could be traced in specific metrical variations, although this level of detail will require much
better annotations and stanza information. Topic information alone is enough to consistently
group together two arguably large samples coming from the same meter (if the median size
of a poem in our corpus is 50 words, then meter-meaning association is quite pronounced in
samples of 10,000 words). We can check the “cumulative” effect of the metrical halo by looking
at how the performance of hierarchical clustering changes with the sample size.</p>
        <p>
          To evaluate unsupervised classification we use two metrics: simple Cluster Purity (CP) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]
(sum of cluster matches divided by number of individual samples) and Adjusted Rand Index
(ARI) [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] that are designed to compare two classifications with the latter also accounting for
classification by chance (returning values around 0). We use the same threshold of 500 available
poems per meter and the same procedure on the increasing sample sizes (up to 250 poems),
calculating CP and ARI for each case of clustering. As expected, clustering accuracy grows
with the increasing number of poems in a sample, up to the median ARI=0.73 and CP=0.90
(Fig. 2c). However, it is important that non-random clustering can be noticed early and some
semantic patterns in meters are recognized at 20-40 poems per sample.
        </p>
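        <p>Both evaluation metrics can be computed directly from two label lists. The sketch below is a minimal pure-Python illustration, assuming that true_labels holds the meter of each sample and clusters holds the cluster it was assigned to after cutting the tree; the function names are hypothetical, and the ARI uses the standard pair-counting form.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from math import comb


def cluster_purity(true_labels, clusters):
    """Share of samples that belong to the majority label of their cluster."""
    by_cluster = {}
    for label, cluster in zip(true_labels, clusters):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    matched = sum(max(counts.values()) for counts in by_cluster.values())
    return matched / len(true_labels)


def adjusted_rand_index(true_labels, clusters):
    """Rand index corrected for chance; expects at least two samples."""
    cells = Counter(zip(true_labels, clusters))
    sum_cells = sum(comb(n, 2) for n in cells.values())
    sum_rows = sum(comb(n, 2) for n in Counter(true_labels).values())
    sum_cols = sum(comb(n, 2) for n in Counter(clusters).values())
    expected = sum_rows * sum_cols / comb(len(true_labels), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial and identical
    return (sum_cells - expected) / (max_index - expected)


# Two samples per meter; a perfect clustering pairs each meter with itself.
meters = ["Iamb-4", "Iamb-4", "Trochee-4", "Trochee-4"]
print(cluster_purity(meters, [0, 0, 1, 1]), adjusted_rand_index(meters, [0, 0, 1, 1]))
```
]]></preformat>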
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Between-meter similarities</title>
        <p>
          The consensus tree in Fig. 2a also hints at overarching semantic relationships between
particular metrical forms. Some iambic meters tend to cluster together, the same happens with
ternary forms that clearly make one group (with low distinction between Amphibrach forms).
We do not really have a ”ground truth” for how poetic forms should relate to each other
semantically, except for some observations of their historical usage, similar origins, etc. However, we
may expect that the transmission of meaning is at least partly bounded by metrical families
(i.e. Iambic or Trochaic), since they cast broadly similar rhythmic and grammatical
boundaries on language [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. In some cases meters of one family also share historical connections.
For example, most Iambic meters were initially used in high-prestige genres: Iamb-4 in ode,
Iamb-5 in drama, Iamb-6 in elegy. At the same time, Trochaic forms were often perceived as
a counterpart to elite iambic verse; some of their rhythmical features overlapped with an oral
tradition, which defined their use as folklore imitations and set up a corresponding range of
associations.
        </p>
        <p>We design a fairly conservative experiment to address the assumed “family influence”: since
the number of available meters per family is not equal, we consider only those families that
have at least two populated meters (&gt; 400 poems). We draw two meters per family at random
20 times; in each set of meters every meter is represented by 300 random poems; clusters are
calculated in the same way as described in Section 4.1, except that now we verify the clustering of
forms against their family “ground truth” (Iamb, Trochee, etc.). The process is repeated 100
times for each of the 20 sets of meters. We report the distribution of mean ARI and CP values
per instance of sampled meters as compared to randomly assigned clusters (Fig. 3a).</p>
        <p>While the resulting clustering evaluation may not seem high (median ARI is around 0.44,
CP around 0.76), these values are enough to confirm that, at least to some extent, between-meter
relationships are driven by the metrical families. To better demonstrate this effect, we
calculated a consensus tree out of 100 clustering results without any restriction on the number of meters
per metrical family (Fig. 3b). Iambs, Dactyls, Anapests and some Trochees tend to consistently
cluster together; the semantics of ternary forms remain somewhat diffused, but they still form one
cluster with each other.</p>
        <p>
          Cases of ”wrong attribution” are potentially informative and align with the scholarly work
on the subject. The similarity of Iamb-3 and Trochee-4 is well-known [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]: they both originate
in 18th-century anacreontics and share “song” semantics across many variants. Iamb-43,
with regular alternation of lines of different foot count, also grew from lyrical song and
ballad and was associated with lyro-epic poetry.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Semantic halo in time</title>
        <p>It is time to abandon the position of an observer who is “looking back” at the whole tradition.
Fig. 2 and Fig. 3 make it clear that clustering is to some extent enhanced by the difference in
the meters’ distribution in time. Since the LDA algorithm exploits patterns of word co-occurrence in
documents, it naturally ends up with groups of words coming from different times (e.g. a “folk”
topic, a “Soviet” topic or a naturalistic war topic). In turn, this drives the divergence calculations:
the Soviet topic has close to zero probability in 18th- and 19th-century texts. We can see this from Dolnik
and Trochee-5 consistently appearing in one cluster: they are two very popular 20th-century forms,
both rare earlier.</p>
        <p>“Free” Iamb could be a good example here. This form was composed of an unregulated
alternation of iambic lines of varying foot count and was almost exclusively used for a specific set of
genres: poetic epistles, fables and epigrams. It was also entirely abandoned after the 1850s. In
all 1,200 poems of free Iamb there are only two topics that together account for 20% of the
probability (48, animals &amp; 46, communication). Despite its name, this meter is frozen in time with
a combination of genres imprinted on it, and two samples of it are easy to cluster together. In
short, our semantic abstraction is not abstract enough to disregard chronological differences.</p>
        <p>Time, however, does not invalidate the general presence of the metrical halo; after all,
asynchronous development and fashion fluctuations of metrical forms shape their perceived
differences and pin their associations to distinct periods (Iamb-4-fm to the “Golden Age”
of the 1820-1830s, Dactyl-3 to the civic and political sentiment of the 1850-1880s, Dolnik to modernist
poetry). Controlling the corpus for chronology is useful not only for testing the halo effect
on a smaller scale; it also creates opportunities to examine the behavior of the meter-meaning
association mechanism in time. We only briefly address it here, since it is a separate problem
that deserves its own experimental design and comparative data.</p>
        <p>First, we want to see if within-meter semantic similarities are present when all samples of poems
come from the same time frame. We divide our corpus into 30-year bins (excluding the 18th
century because of the low variety in popular meters). For each time frame we take its six most
frequent meters and run 100 iterations of clustering, taking two random samples of poems per
meter, and report mean ARI values (Table 1). These values are not directly comparable, since
we use different pools of meters and sample sizes (half of the minimum meter count per period,
floored), but they are enough to point out that the halo effect can be traced within limited time
frames and, in fact, for some periods in samples much smaller than expected (see Fig. 2).</p>
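        <p>The binning and the choice of sample size can be written down explicitly. This is a minimal sketch with hypothetical function names; the first bin starting in 1800 is our assumption for illustration, since the exact bin boundaries are not stated above.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def period_start(year, first_year=1800, width=30):
    """Start year of the fixed-width bin that contains `year`.

    first_year=1800 is an assumed boundary, not taken from the corpus.
    """
    return first_year + ((year - first_year) // width) * width


def sample_size_for_period(meters, top_n=6):
    """Half of the smallest count among the top_n most frequent meters, floored."""
    top = Counter(meters).most_common(top_n)
    return min(count for _, count in top) // 2


# Toy period with two meters: 9 and 5 poems give a sample size of 5 // 2 = 2.
toy_meters = ["Iamb-4"] * 9 + ["Trochee-4"] * 5
print(period_start(1825), sample_size_for_period(toy_meters))
```
]]></preformat>
        <p>Two samples of this size can then be drawn without overlap from every selected meter in the period, because the size never exceeds half of the smallest meter.</p>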
        <p>Second, it is possible to use chronological information to ask questions about halo behavior
and semantic accumulation. If the semantic halo of a meter is taken as a historical phenomenon and
not as an organic one, we expect a non-uniform, decreasing strength of association between
meter and meaning across time. Specifically, we expect to find a discrepancy in clustering
evaluation between early 19th-century poetry and the second half of the century. This would
confirm a historical diffusion of metrical semantics and the unchaining of the connection between
form &amp; genre imposed on poetry by normative aesthetics. Since our approach to classification
depends on sample size, we simply split the 19th-century data into two halves and observe the distribution
of ARI values when clustering is performed on strictly the same set of meters and sample
sizes for each of the two chronological groups.</p>
        <p>The difference between the two periods turns out to be significant (Table 2). Samples of 40
poems, the maximum possible sample size to use for the two groups, were better at distinguishing
meters in the first half of the 19th century, signaling a focused metrical usage. If we increase the
sample size for the second half of the 19th century, then mean clustering accuracy predictably
goes up (sample size=100, ARI=0.43), which points to the process of semantic accumulation
in meters: they become more diffused semantically, but not unrecognizable.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Topic expression and frequency of a meter</title>
        <p>From the start of this paper there was one daunting question: what if the whole halo effect
arises from the simple fact that all metrical forms differ in popularity? It is easy to draw any
kind of topic configuration from Iamb-4, less so from Iamb-3, while options for Trochee-3-dm
are almost non-existent. The semantic halo thus may be born from a combination of sampling
error, chronological differences and a confirmation bias in scholars reading same-meter poems
similarly.</p>
        <p>It becomes more complicated, since scholars tell us to expect that the distinctiveness of
a halo naturally decreases as the frequency of a meter increases. There is just no
space for semantic variation in rare forms. Thus we assume that 1) the relationship between
the strength of expression of a meter and its frequency should be linear; 2) if semantics are
randomly assigned to meters, the trend should be different.</p>
        <p>To measure how “distinct” the semantics within a single meter are, we can use the curves of
topic probability distributions and look at them from the perspective of “inequality”. In less
populated meters we expect to find fewer topics that dominate the distribution than in highly
frequent forms. Fig. 4a shows an example: topic probabilities are aggregated from all poems for
each of the two metrical variants and arranged according to their overall contribution to the meter’s
semantics. For each of the resulting curves a Gini coefficient, originally designed for measuring
national inequality in wealth distribution, could be calculated. Gini takes a value of 1 when a
distribution is maximally unequal (a single topic is 100% probable) and 0 when it is perfectly
equal (each of the 80 topics is 1.25% probable). It is apparent that this coefficient is enough
to capture how focused a meter’s semantics is, at least relatively, while Gini’s absolute values
would be influenced by the LDA priors (an alpha of 0.1 assumes more inequality in the topic probabilities
of a poem than an alpha of 0.5).</p>
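        <p>For a concrete reading of the coefficient, the sketch below computes it from a list of topic probabilities. It is a minimal illustration with a hypothetical function name, using one common sorted-values formula; with this particular estimator and 80 topics the maximum is 79/80 = 0.9875 rather than exactly 1, and a rescaled variant would be needed to reach 1.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def gini(probabilities):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    values = sorted(probabilities)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Each value is weighted by its 1-based rank in ascending order.
    weighted = sum((i + 1) * v for i, v in enumerate(values))
    return 2 * weighted / (n * total) - (n + 1) / n


uniform = [1 / 80] * 80          # every topic equally probable
focused = [1.0] + [0.0] * 79     # a single topic carries all probability
print(round(gini(uniform), 4), round(gini(focused), 4))
```
]]></preformat>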
        <p>To verify the diffusion of a halo and its (non-)random nature we first calculate Gini
coefficients for each metrical variant that was used in the corpus at least 10 times. Then the same
calculation is done when semantics are redistributed: for each frequency n of a real metrical
form we randomly sample (without replacement) n poems from the corpus. In the end all
poems end up being randomly reassigned to empty “meter bins”, and this redistribution is done
20 separate times. If the halo could not be simply sampled at random, then some divergence in
how inequality correlates with sample size between the two groups of points should be noticeable.</p>
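        <p>One round of this redistribution amounts to shuffling the poems and dealing them into bins of the original sizes. The following is a minimal sketch with hypothetical names, assuming that poems are referred to by identifiers and that meter_sizes maps each real metrical form to its frequency n.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def redistribute(poem_ids, meter_sizes, seed=None):
    """Randomly deal poems into bins whose sizes copy the real meter counts."""
    if sum(meter_sizes.values()) > len(poem_ids):
        raise ValueError("bins ask for more poems than the corpus holds")
    rng = random.Random(seed)
    shuffled = list(poem_ids)
    rng.shuffle(shuffled)
    bins, start = {}, 0
    for meter, size in meter_sizes.items():
        # Consecutive slices of one shuffle: sampling without replacement.
        bins[meter] = shuffled[start:start + size]
        start += size
    return bins


# Toy corpus of 10 poems and two "meter bins" (invented sizes).
random_bins = redistribute(range(10), {"Iamb-4-fm": 7, "Trochee-3-dm": 3}, seed=1)
print({meter: len(ids) for meter, ids in random_bins.items()})
```
]]></preformat>
        <p>Repeating the call with different seeds gives the separate redistributions, each of which can be scored with the same inequality measure as the real meters.</p>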
        <p>Fig. 4b shows differences between inequality distributions in randomly aggregated poems
and actual metrical forms on log-log scales. Semantic inequality decreases with a
meter’s frequency, as expected, with very telling outliers that signal a concentrated halo
in frequent metrical forms (most notably free Iamb). On the other hand, inequality in randomly
redistributed data decays more quickly, leaving many actual meters above the line. This suggests
that even if comparable levels of topical inequality in the very infrequent forms might occur
randomly, there is no reason to expect this for the semantic halo in general. It is highly improbable
to randomly sample the semantic curve even of Iamb-4-fm, which was always considered a neutral,
general-use form.</p>
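        <p>One hedged way to quantify “above the line” is sketched below. It assumes that the random trend is summarized by a straight line fitted by least squares in log-log space, which the text does not state; the function names and the toy numbers are hypothetical and serve only to illustrate the comparison.</p>
        <preformat>
```python
import math


def fit_loglog(points):
    """Least-squares fit of log(gini) on log(n); returns (slope, intercept)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(g) for _, g in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def share_above(real_points, slope, intercept):
    """Fraction of real meters whose Gini lies above the random trend."""
    above = sum(
        1 for n, g in real_points
        if math.log(g) > intercept + slope * math.log(n)
    )
    return above / len(real_points)


# Toy (sample size, Gini) pairs; not values from the paper.
random_points = [(10, 0.8), (100, 0.4), (1000, 0.2)]
real_points = [(10, 0.85), (100, 0.5), (1000, 0.18)]

slope, intercept = fit_loglog(random_points)
print(round(slope, 3))                                      # -0.301
print(round(share_above(real_points, slope, intercept), 3))  # 0.667
```
        </preformat>
        <p>A share well above one half would indicate that actual meters keep more concentrated semantics than random samples of the same size.</p>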
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>We were able to show that topical information alone could be used to recognize a poetic form.
Poems written in the same meter retain semantic similarity to one another. Separate meters
demonstrate stable relationships between them, which are driven in part by their origins in a
metrical family. Historical differences in classification accuracy also suggested semantic
accumulation in metrical forms and a diffusion of a meter’s “meaning” over time without swamping
it beyond recognition. These findings, we believe, confirm the theory of the semantic halo and its
main assumptions, at least on a general level.</p>
      <p>
        In the future, the effect of the metrical halo could be better understood within a cultural evolution
framework [
        <xref ref-type="bibr" rid="ref46">29, 48</xref>
        ], which provides plenty of options to make explicit how we think
about the workings of historical processes and cultural transmission. Cultural evolution is an
emergent field that studies variation, retention and diffusion of cultural information (which is
usually defined as any information acquired via social learning). This framework encompasses
research across various disciplines and domains: it has been used to reconstruct cultural
phylogenies of archaeological artifacts [
        <xref ref-type="bibr" rid="ref30">32</xref>
        ], folktales [
        <xref ref-type="bibr" rid="ref45">47</xref>
        ] and medieval manuscripts [
        <xref ref-type="bibr" rid="ref2 ref57">2, 59</xref>
        ]; understand
how people learn and contribute to cumulative forces of culture [
        <xref ref-type="bibr" rid="ref50">52</xref>
        ]; address innovation rates
in popular music [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]; study macro patterns in language evolution [
        <xref ref-type="bibr" rid="ref15 ref6">15, 6</xref>
        ], or the effects of
population size on the diffusion and retention of cultural information [
        <xref ref-type="bibr" rid="ref22 ref35">22, 37</xref>
        ].
      </p>
      <p>It would not be too far-fetched to say that all new poems originate from previous ones.
Most often they are products of imitation: it is extremely rare for a poet to completely escape
any exposure to a tradition or to single-handedly create a whole versification system. Should
it happen, chances are high that these individual efforts will not survive for long, simply
because there will not be enough followers. Poetic forms are persistent and conservative: things
like Iambic Pentameter, or rhyming, or a sonnet pattern could survive for centuries. This means
new poems share plenty of formal features (like meter) with their predecessors, effectively
replicating the previously used form. Arguably, nothing should stop a poet in a modern, highly
individualized tradition of lyrical self-expression from using a metrical form absolutely freely,
independently of its semantic inertia. We see that this is not the case.</p>
      <p>
        Meters and poetic forms could be seen as behaving similarly to ”transmission isolating
mechanisms” [
        <xref ref-type="bibr" rid="ref10 ref49">10, 51</xref>
        ] of culture. These mechanisms are certain conditions (marriage traditions,
household organization, etc.) that maintain some level of “vertical” (parents to offspring)
transmission of information in culture, which is usually seen as the domain of extensive
“horizontal” connections (peers to peers). Meters, similarly, limit the semantic possibilities
of poems and push meaning creation into fuzzy, yet distinct pathways. This ensures that
meters in modern poetic history act as “weak genres”, replicating an expanding set
of features in poems that also share similar formal origins.
      </p>
      <p>
        Why should this formal isolation happen in poetry at all? An obvious answer lies in meter’s
capacity as an effective mnemonic device that pushes language into a high-order pattern and
enhances memorability [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Poetic forms originated in oral traditions, which, in turn, relied
on many formal mechanisms (meter, rhyme, formulaic language, formulaic plots, etc.) that
limited options for how a text could be made and retold to facilitate its memorization and
transmission [
        <xref ref-type="bibr" rid="ref37">39</xref>
        ]. Apparently the mnemonic strength of a meter would matter in written
traditions too. It is not just a form that is remembered and reproduced; by simply turning its
wheels a meter carries its ever-expanding baggage further. The individual histories of metrical
forms in Russian poetry had a lot of twists and turns and consciously guided revolutions in
metrical expectations (because these expectations existed). However, it seems that no one
could truly escape the mnemonic tyranny of the accentual-syllabic verse.
      </p>
      <p>We expect the semantic halo to be present (to a larger or smaller extent) in any poetic
tradition based on any versification system that allows for distinct and stable poetic forms across
time. Topic models could give us a set of abstract derived measures (classification accuracy,
inequality, etc.) that could be compared across languages and traditions. This, in turn,
provides access to general questions of literary history: how poetic genres “fall” in time, to
what extent meters continue to be recognizable (if at all), how the invention of new forms
happens, or what the role of individual poets and poems is in shaping the semantic halo.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>AŠ was funded by the project “Large-Scale Text Analysis and Methodological Foundations of
Computational Stylistics” (NCN 2017/26/E/HS2/01019), supported by the Polish National Science
Centre. We would like to thank two anonymous reviewers for their attentive reading that
allowed us to make this paper better. We thank Joanna Byszuk, Maciej Eder, Antonina
Martynenko, Vera Polilova and Oleg Sobchuk for all of their contributions, help and support.</p>
      <p>[28] A. Meillet. Les origines indo-européennes des mètres grecs. Paris: Les Presses
universitaires de France, 1923. url: http://archive.org/details/lesoriginesindoe00meiluoft
(visited on 07/26/2020).</p>
      <p>[29] A. Mesoudi. Cultural Evolution: How Darwinian Theory Can Explain Human Culture
and Synthesize the Social Sciences. University of Chicago Press, 2011.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Code and data availability</title>
      <p>
        Document-Term Matrix, final preprocessing steps, final models &amp; full analysis are openly
available at https://github.com/perechen/semantic_halo_rus. We used R 4.0.2 [
        <xref ref-type="bibr" rid="ref36">38</xref>
        ] for
the analysis, the LDA implementation in the topicmodels package [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], relied on tidytext wrappings
for the model’s output [
        <xref ref-type="bibr" rid="ref44">46</xref>
        ], philentropy [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and ineq [
        <xref ref-type="bibr" rid="ref56">58</xref>
        ] for calculations, ggtree for drawing
trees [
        <xref ref-type="bibr" rid="ref55">57</xref>
        ], patchwork for assembling plots [
        <xref ref-type="bibr" rid="ref32">34</xref>
        ]. The ghibli package [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] provided the color palette
”MononokeMedium” for the plots.
      </p>
    </sec>
    <sec id="sec-8">
      <title>B. Appendix</title>
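      <table-wrap>
        <label>Table B.3</label>
        <caption>
          <p>All major accentual-syllabic meters used in the study. 1 denotes a strong position in a foot (stress expected), 0 a weak one. Brackets enclose syllables that may or may not appear (end of a line; dolnik’s interval). English examples are provided for classical meters.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Scheme</th><th>Example</th><th>Comment</th></tr>
          </thead>
          <tbody>
            <tr><td>01|01|01|01|01(00)</td><td>Thus was|I, slee|ping, by | a bro|ther’s hand<break/>Of life,| of crown,| of queen,| at once| dispatch’d</td><td>Iambic Pentameter</td></tr>
            <tr><td>10|10|10|1(00)</td><td>Tell me| not in| mournful | numbers,<break/>Life is | but an | empty | dream</td><td/></tr>
            <tr><td>100|100|100|1(00)</td><td>Brightest and| best of the | sons of the| morning</td><td/></tr>
            <tr><td>010|010|010|01(00)</td><td>Oh, hush thee,| my baby|, thy sire was | a knight<break/>Thy mother | a lady| both lovely | and bright</td><td/></tr>
            <tr><td>001|001(00)</td><td>He is gone | on the moun|tain<break/>He is lost | to the for|est</td><td/></tr>
            <tr><td>(00)1(0)01(0)01(00)</td><td/><td>3-ictus Dolnik: based on the number of stressed positions (3), but the unstressed syllable interval is limited to 1-2 syllables</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <table-wrap>
        <label>Table B.4</label>
        <caption>
          <p>Distinctive topics (words translated) in three meters compared to Gasparov’s descriptions [<xref ref-type="bibr" rid="ref14">14</xref>]. Top 10 topics with most deviation from the mean are listed. Topics that could be relevant for halo are highlighted.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Meter</th><th>Halo (Gasparov)</th></tr>
          </thead>
          <tbody>
            <tr><td>Trochee-5-fm</td><td>Night, Landscape, Love, Death, Road</td></tr>
            <tr><td>Trochee-3-fm</td><td>Song, Road, Nature, Yearning, Love, Death</td></tr>
            <tr><td>Iamb-4-dm</td><td>dangerous movement through space / love</td></tr>
          </tbody>
        </table>
        <table>
          <thead>
            <tr><th>Topic</th><th>Top words</th></tr>
          </thead>
          <tbody>
            <tr><td>69</td><td>to know, to live, to be, to die, nothing</td></tr>
            <tr><td>41</td><td>war, to go, soldier, battle, bullet</td></tr>
            <tr><td>61</td><td>goodbye, last, to go (away), hand, parting</td></tr>
            <tr><td>25</td><td>wind, steppe, sand, grass, desert</td></tr>
            <tr><td>66</td><td>garden, green, leaf, branch, linden</td></tr>
            <tr><td>45</td><td>train, wheel, smoke, to fly, wind</td></tr>
            <tr><td>38</td><td>window, house, wall, room, table</td></tr>
            <tr><td>39</td><td>water, river, shore, to swim, lake</td></tr>
            <tr><td>31</td><td>to go, path, road, to cross, leg</td></tr>
            <tr><td>76</td><td>to sing, song, nightingale, voice</td></tr>
            <tr><td>77</td><td>matter, take, give, comrade, most</td></tr>
            <tr><td>23</td><td>red, to go, oi, white, “ka” (folksong love topic)</td></tr>
            <tr><td>43</td><td>snow, white, ice, winter, snowy</td></tr>
            <tr><td>51</td><td>door, house, enter, window, wait</td></tr>
            <tr><td>51</td><td>woods, pine, green, tree</td></tr>
            <tr><td>10</td><td>wind, leaf, autumn, rain, autumn</td></tr>
            <tr><td>21</td><td>dream, to dream, night, to wake, morning</td></tr>
            <tr><td>22</td><td>night, darkness, murk, dark</td></tr>
            <tr><td>31</td><td>to go, path, road, to cross, leg</td></tr>
            <tr><td>62</td><td>city, tower, wall, stone</td></tr>
            <tr><td>59</td><td>horror, death, evil, blood</td></tr>
            <tr><td>1</td><td>star, world, sky, earth, abyss</td></tr>
            <tr><td>6</td><td>shade, dream, ghost, pale</td></tr>
            <tr><td>4</td><td>soul, dream, beauty, world, power</td></tr>
            <tr><td>68</td><td>hour, wait, to come, soon, or</td></tr>
            <tr><td>12</td><td>god, temple, tsar, before, world</td></tr>
            <tr><td>61</td><td>goodbye, last, to go (away), hand, parting</td></tr>
            <tr><td>80</td><td>city, road, house, light, to go</td></tr>
            <tr><td>50</td><td>poem, write, poet, book, word</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <table-wrap>
        <label>Table B.5</label>
        <caption>
          <p>Median Adjusted Rand Index values for within-meter clustering (Fig. 2c) in different samples (ARI @ n poems per sample), using LDA models with a varying number of topics k. The ARI family column holds ARI values for between-meter similarity tests (Fig. 3a). The robust performance of clustering in various models shows that the number of topics has little influence on the experimental results and thus, surprisingly, is not really an influential variable. LDA with 20 topics shows one of the highest performances at 250 poems per sample, which further highlights the effectiveness of the reduction of semantic information. There is, however, a slight noticeable trade-off between local (within-meter) and global (between-meter) recognition (models with 80 and 100 topics seem to provide the most balanced performance). Additional tests were made for LDA models trained on the original Document-Term Matrix (without replacements of less frequent words with their more frequent semantic neighbours).</p>
        </caption>
        <table>
          <thead>
            <tr><th>DTM</th><th>k (number of topics)</th><th>ARI family</th></tr>
          </thead>
          <tbody>
            <tr><td>w/ replacement</td><td>10</td><td>0.47</td></tr>
            <tr><td>w/ replacement</td><td>20</td><td>0.45</td></tr>
            <tr><td>w/ replacement</td><td>40</td><td>0.44</td></tr>
            <tr><td>w/ replacement</td><td>60</td><td>0.41</td></tr>
            <tr><td>w/ replacement</td><td>80</td><td>0.44</td></tr>
            <tr><td>w/ replacement</td><td>100</td><td>0.45</td></tr>
            <tr><td>w/ replacement</td><td>120</td><td>0.45</td></tr>
            <tr><td>w/ replacement</td><td>150</td><td>0.42</td></tr>
            <tr><td>w/ replacement</td><td>200</td><td>0.42</td></tr>
            <tr><td>w/o replacement</td><td>20</td><td>0.46</td></tr>
            <tr><td>w/o replacement</td><td>80</td><td>0.41</td></tr>
            <tr><td>w/o replacement</td><td>150</td><td>0.45</td></tr>
          </tbody>
        </table>
      </table-wrap>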
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Asgari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghassemi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Finlayson</surname>
          </string-name>
          . “
          <article-title>Confirming the themes and interpretive unity of Ghazal poetry using topic models”</article-title>
          .
          <source>In: Neural Information Processing Systems (NIPS) Workshop for Topic Models</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Barbrook</surname>
          </string-name>
          et al. “
          <article-title>The phylogeny of The Canterbury Tales”</article-title>
          .
          <source>In: Nature</source>
          <volume>394</volume>
          .6696 (
          <issue>Aug</issue>
          .
          <year>1998</year>
          ), pp.
          <fpage>839</fpage>
          -
          <lpage>839</lpage>
          . issn:
          <fpage>1476</fpage>
          -
          <lpage>4687</lpage>
          . doi:
          <volume>10</volume>
          .1038/29667. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. T. J.</given-names>
            <surname>Barron</surname>
          </string-name>
          et al. “
          <article-title>Individuals, institutions, and innovation in the debates of the French Revolution”</article-title>
          .
          <source>In: Proceedings of the National Academy of Sciences 115.18 (May</source>
          <year>2018</year>
          ), pp.
          <fpage>4607</fpage>
          -
          <lpage>4612</lpage>
          . issn:
          <fpage>0027</fpage>
          -
          <lpage>8424</lpage>
          ,
          <fpage>1091</fpage>
          -
          <lpage>6490</lpage>
          . doi:
          <volume>10</volume>
          .1073/pnas.1717729115. (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bicego</surname>
          </string-name>
          et al. “
          <article-title>Investigating Topic Models' Capabilities in Expression Microarray Data Classification”</article-title>
          .
          <source>In: IEEE/ACM Transactions on Computational Biology and Bioinformatics 9</source>
          .6 (
          <issue>2012</issue>
          ), pp.
          <fpage>1831</fpage>
          -
          <lpage>1836</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          . “
          <article-title>Latent Dirichlet Allocation”</article-title>
          .
          <source>In: Journal of Machine Learning Research</source>
          <volume>3</volume>
          (
          <year>2003</year>
          ), pp.
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bouckaert</surname>
          </string-name>
          et al. “
          <article-title>Mapping the Origins and Expansion of the Indo-European Language Family”</article-title>
          .
          <source>In: Science</source>
          <volume>337</volume>
          .6097 (
          <issue>Aug</issue>
          .
          <year>2012</year>
          ), pp.
          <fpage>957</fpage>
          -
          <lpage>960</lpage>
          . issn:
          <fpage>0036</fpage>
          -
          <lpage>8075</lpage>
          ,
          <fpage>1095</fpage>
          -
          <lpage>9203</lpage>
          . doi:
          <volume>10</volume>
          .1126/science.1219669. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Cafiero</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.-B.</given-names>
            <surname>Camps</surname>
          </string-name>
          . “
          <article-title>Why Molière most likely did write his plays”</article-title>
          .
          <source>In: Science Advances 5.11 (Nov</source>
          .
          <year>2019</year>
          ). issn:
          <fpage>2375</fpage>
          -
          <lpage>2548</lpage>
          . doi:
          <volume>10</volume>
          .1126/sciadv.aax5489.
          <source>(Visited on 12/16/</source>
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>DiMaggio</surname>
          </string-name>
          , M. Nag, and
          <string-name>
            <given-names>D.</given-names>
            <surname>Blei</surname>
          </string-name>
          . “
          <article-title>Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of U.S. government arts funding”</article-title>
          .
          <source>In: Poetics</source>
          <volume>41</volume>
          .6 (
          <issue>Dec</issue>
          .
          <year>2013</year>
          ), pp.
          <fpage>570</fpage>
          -
          <lpage>606</lpage>
          . issn: 0304422X. doi:
          <volume>10</volume>
          .1016/j.poetic.
          <year>2013</year>
          .
          <volume>08</volume>
          .
          <fpage>004</fpage>
          . (Visited on 03/10/
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.-G.</given-names>
            <surname>Drost</surname>
          </string-name>
          . “
          <article-title>Philentropy: Information Theory and Distance Quantification with R”</article-title>
          .
          <source>In: Journal of Open Source Software 3.26 (June</source>
          <year>2018</year>
          ). issn:
          <fpage>2475</fpage>
          -
          <lpage>9066</lpage>
          . doi:
          <volume>10</volume>
          .21105/joss.00765. (Visited on 09/13/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W. H.</given-names>
            <surname>Durham</surname>
          </string-name>
          . “
          <article-title>Advances in Evolutionary Culture Theory”</article-title>
          .
          <source>In: Annual Review of Anthropology 19.1</source>
          (
          <issue>1990</issue>
          ), pp.
          <fpage>187</fpage>
          -
          <lpage>210</lpage>
          . doi:
          <volume>10</volume>
          .1146/annurev.an.
          <volume>19</volume>
          .100190.001155.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Felsenstein</surname>
          </string-name>
          . “
          <article-title>Confidence Limits on Phylogenies: An Approach Using the Bootstrap”</article-title>
          .
          <source>In: Evolution 39.4</source>
          (
          <issue>1985</issue>
          ), pp.
          <fpage>783</fpage>
          -
          <lpage>791</lpage>
          . issn:
          <fpage>1558</fpage>
          -
          <lpage>5646</lpage>
          . doi:
          <volume>10</volume>
          .1111/j.1558-
          <fpage>5646</fpage>
          .
          <year>1985</year>
          .tb00420.x.
          <source>(Visited on 07/26/</source>
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Gasparov</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tarlinskaja</surname>
          </string-name>
          . “
          <article-title>The Linguistics of Verse”</article-title>
          .
          <source>In: The Slavic and East European Journal 52.2</source>
          (
          <issue>2008</issue>
          ), pp.
          <fpage>198</fpage>
          -
          <lpage>207</lpage>
          . issn:
          <fpage>0037</fpage>
          -
          <lpage>6752</lpage>
          . doi:
          <volume>10</volume>
          .2307/20459662. (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gasparov</surname>
          </string-name>
          .
          <article-title>A History of European Versification</article-title>
          . Oxford, New York: Oxford University Press,
          <year>July 1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gasparov</surname>
          </string-name>
          .
          <article-title>Metr i smysl: ob odnom iz mekhanizmov kulturnoi pamiati</article-title>
          . Moscow: Izdatelskii tsentr
          <string-name>
            <surname>RGGU</surname>
          </string-name>
          ,
          <year>1999</year>
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Gray</surname>
          </string-name>
          and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Jordan</surname>
          </string-name>
          . “
          <article-title>Language trees support the express-train sequence of Austronesian expansion”</article-title>
          .
          <source>In: Nature</source>
          <volume>405</volume>
          .6790 (
          <year>June 2000</year>
          ), pp.
          <fpage>1052</fpage>
          -
          <lpage>1055</lpage>
          . issn:
          <fpage>1476</fpage>
          -
          <lpage>4687</lpage>
          . doi:
          <volume>10</volume>
          .1038/35016575. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Grishina</surname>
          </string-name>
          et al. “
          <article-title>Poeticheskii korpus v ramkah NKRIA: obschaia struktura i perspektivy ispolzovania”</article-title>
          . In: Natsionalnii korpus russkogo iazyka:
          <fpage>2006</fpage>
          -
          <lpage>2008</lpage>
          .
          <article-title>Novye rezultaty i perspektivy</article-title>
          .
          <source>St. Petersburg: Nestor-Istoria</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gronas</surname>
          </string-name>
          .
          <source>Cognitive Poetics and Cultural Memory: Russian Literary Mnemonics</source>
          .
          <edition>1st edition</edition>
          . New York: Routledge,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B.</given-names>
            <surname>Grün</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Hornik</surname>
          </string-name>
          .
          <article-title>“topicmodels: An R Package for Fitting Topic Models”</article-title>
          .
          <source>In: Journal of Statistical Software 40.13</source>
          (
          <year>2011</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>30</lpage>
          . doi:
          <volume>10</volume>
          .18637/jss.v040.
          <year>i13</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Haider</surname>
          </string-name>
          . “
          <article-title>Diachronic Topics in New High German Poetry”</article-title>
          .
          <source>In: Proceedings of the International Digital Humanities Conference. Utrecht</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          . url: https://dev.clariah.nl/files/dh2019/boa/1031.html (visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          . “
          <article-title>Studying the History of Ideas Using Topic Models”</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. EMNLP '08</source>
          .
          <string-name>
            <surname>Stroudsburg</surname>
          </string-name>
          , PA, USA: Association for Computational Linguistics,
          <year>2008</year>
          , pp.
          <fpage>363</fpage>
          -
          <lpage>371</lpage>
          . (Visited on 12/11/
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>E.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Desrosiers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Chirico</surname>
          </string-name>
          .
          <source>ghibli: Studio Ghibli Colour Palettes</source>
          . Apr.
          <year>2020</year>
          . url: https://CRAN.R-project.org/package=ghibli (visited on 09/13/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Henrich</surname>
          </string-name>
          . “
          <article-title>Demography and Cultural Evolution: How Adaptive Cultural Processes can Produce Maladaptive Losses: The Tasmanian Case”</article-title>
          .
          <source>In: American Antiquity 69.2</source>
          (
          <issue>2004</issue>
          ), pp.
          <fpage>197</fpage>
          -
          <lpage>214</lpage>
          . issn:
          <fpage>0002</fpage>
          -
          <lpage>7316</lpage>
          . doi:
          <volume>10</volume>
          .2307/4128416. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Hubert</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Arabie</surname>
          </string-name>
          . “
          <article-title>Comparing partitions”</article-title>
          .
          <source>In: Journal of Classification 2</source>
          .
          <fpage>2</fpage>
          -
          <lpage>3</lpage>
          (
          <year>1985</year>
          ), pp.
          <fpage>193</fpage>
          -
          <lpage>218</lpage>
          . issn:
          <fpage>0176</fpage>
          -
          <lpage>4268</lpage>
          (Print).
          <source>doi: 10</source>
          .1007/BF01908075.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>R.</given-names>
            <surname>Jakobson</surname>
          </string-name>
          . “
          <article-title>Toward a Description of Mácha's Verse”</article-title>
          .
          <source>In: R. Jakobson. Selected Writings</source>
          . Vol.
          <volume>5</volume>
          . The Hague, Paris, NY,
          <year>1979</year>
          , pp.
          <fpage>433</fpage>
          -
          <lpage>485</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>K.</given-names>
            <surname>Korchagin</surname>
          </string-name>
          . “
          <article-title>Poezija XX veka v poeticheskom podkorpuse Natsional'nogo korpusa russkogo iazyka: problema reprezentativnosti”</article-title>
          . In:
          <article-title>Trudy instituta im</article-title>
          . V.V.
          <article-title>Vinogradova 6 (</article-title>
          <year>2015</year>
          ), pp.
          <fpage>235</fpage>
          -
          <lpage>256</lpage>
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kullback</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Leibler</surname>
          </string-name>
          . “
          <article-title>On Information and Sufficiency”</article-title>
          .
          <source>In: The Annals of Mathematical Statistics</source>
          <volume>22</volume>
          .1 (
          <year>1951</year>
          ), pp.
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          . issn: 0003-4851
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mauch</surname>
          </string-name>
          et al. “
          <article-title>The evolution of popular music: USA 1960-2010”</article-title>
          .
          <source>In: Royal Society Open Science</source>
          <volume>2</volume>
          (
          <year>2015</year>
          ). url: http://dx.doi.org/10.1098/rsos.150081 (visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>J.</given-names>
            <surname>Murdock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Allen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>DeDeo</surname>
          </string-name>
          . “
          <article-title>Exploration and Exploitation of Victorian Science in Darwin's Reading Notebooks”</article-title>
          .
          <source>In: Cognition</source>
          <volume>159</volume>
          (Feb.
          <year>2017</year>
          ), pp.
          <fpage>117</fpage>
          -
          <lpage>126</lpage>
          . issn: 0010-0277. doi: 10.1016/j.cognition.2016.11.012
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>B.</given-names>
            <surname>Navarro-Colorado</surname>
          </string-name>
          . “
          <article-title>On Poetic Topic Modeling: Extracting Themes and Motifs From a Corpus of Spanish Poetry”</article-title>
          .
          <source>In: Frontiers in Digital Humanities</source>
          <volume>5</volume>
          (
          <year>2018</year>
          ), p.
          <fpage>15</fpage>
          . issn: 2297-2668. doi: 10.3389/fdigh.2018.00015.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>O'Brien</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Lyman</surname>
          </string-name>
          . “
          <article-title>Evolutionary archeology: Current status and future prospects”</article-title>
          .
          <source>In: Evolutionary Anthropology: Issues, News, and Reviews</source>
          <volume>11</volume>
          .1 (
          <year>2002</year>
          ), pp.
          <fpage>26</fpage>
          -
          <lpage>36</lpage>
          . issn: 1520-6505. doi: 10.1002/evan.10007. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>B.</given-names>
            <surname>Orekhov</surname>
          </string-name>
          .
          <source>Bashkirskii stikh XX veka. Korpusnoe issledovanie</source>
          . St. Petersburg: Aletejia,
          <year>2019</year>
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          .
          <source>patchwork: The Composer of Plots</source>
          . June
          <year>2020</year>
          . url: https://CRAN.R-project.org/package=patchwork (visited on 09/13/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>A.</given-names>
            <surname>Piperski</surname>
          </string-name>
          . “
          <article-title>Semantic halo of a meter: a keyword-based approach”</article-title>
          .
          <source>In: Kompiuternaia lingvistika i intellectualnyie tekhnologii</source>
          . Vol.
          <volume>2</volume>
          . Kompiuternaia lingvistika: lingvisticheskiie issledovaniia
          . Moscow: RGGU,
          <year>2017</year>
          , pp.
          <fpage>342</fpage>
          -
          <lpage>354</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>P.</given-names>
            <surname>Plechac</surname>
          </string-name>
          and
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Haider</surname>
          </string-name>
          . “
          <article-title>Mapping Topic Evolution Across Poetic Traditions”</article-title>
          . In: arXiv:2006.15732 [cs, stat] (Aug.
          <year>2020</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . url: http://arxiv.org/abs/2006.15732 (visited on 09/28/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>A.</given-names>
            <surname>Powell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shennan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Thomas</surname>
          </string-name>
          . “
          <article-title>Late Pleistocene Demography and the Appearance of Modern Human Behavior”</article-title>
          .
          <source>In: Science</source>
          <volume>324</volume>
          .5932 (June
          <year>2009</year>
          ), pp.
          <fpage>1298</fpage>
          -
          <lpage>1301</lpage>
          . issn: 0036-8075, 1095-9203. doi: 10.1126/science.1170165. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [38]
          <collab>R Core Team</collab>
          .
          <source>R: A Language and Environment for Statistical Computing</source>
          . Vienna, Austria: R Foundation for Statistical Computing,
          <year>2019</year>
          . url: https://www.R-project.org/.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Rubin</surname>
          </string-name>
          .
          <source>Memory in Oral Traditions: The Cognitive Psychology of Epic, Ballads, and Counting-Out Rhymes</source>
          . Oxford University Press USA,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [40]
          <source>Russian National Corpus</source>
          .
          <year>2003</year>
          . url: https://ruscorpora.ru/new/en/index.html (visited on 09/13/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sbalchiero</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Eder</surname>
          </string-name>
          . “
          <article-title>Topic modeling, long texts and the best number of topics. Some Problems and solutions”</article-title>
          .
          <source>In: Quality &amp; Quantity</source>
          <volume>54</volume>
          .4 (Aug.
          <year>2020</year>
          ), pp.
          <fpage>1095</fpage>
          -
          <lpage>1108</lpage>
          . issn: 1573-7845. doi: 10.1007/s11135-020-00976-w. url: https://doi.org/10.1007/s11135-020-00976-w (visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>C.</given-names>
            <surname>Schöch</surname>
          </string-name>
          . “
          <article-title>Topic Modeling Genre: An Exploration of French Classical and Enlightenment Drama”</article-title>
          .
          <source>In: Digital Humanities Quarterly</source>
          <volume>011</volume>
          .2 (May
          <year>2017</year>
          ). issn: 1938-4122
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>I.</given-names>
            <surname>Segalovich</surname>
          </string-name>
          . “
          <article-title>A Fast Morphological Algorithm with Unknown Word Guessing Induced by a Dictionary for a Web Search Engine”</article-title>
          .
          <source>In: MLMTA</source>
          .
          <year>2003</year>
          , pp.
          <fpage>273</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shapir</surname>
          </string-name>
          . “
          <article-title>“Semanticheskii oreol metra”: termin i poniatie”</article-title>
          .
          <source>In: M. Shapir. Universum versus: iazyk, stikh, smysl v russkoi poezii XVIII-XX vekov</source>
          . Vol.
          <volume>2</volume>
          . Moskva: Iazyki slavianskoi kul'tury,
          <year>2015</year>
          , pp.
          <fpage>395</fpage>
          -
          <lpage>404</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelya</surname>
          </string-name>
          and
          <string-name>
            <given-names>O.</given-names>
            <surname>Sobchuk</surname>
          </string-name>
          . “
          <article-title>The shortest species: how the length of Russian poetry changed (1750-1921)”</article-title>
          .
          <source>In: Studia Metrica et Poetica</source>
          <volume>4</volume>
          .1 (Aug.
          <year>2017</year>
          ), pp.
          <fpage>66</fpage>
          -
          <lpage>84</lpage>
          . issn: 2346-691X. doi: 10.12697/smp.2017.4.1.03
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>J.</given-names>
            <surname>Silge</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Robinson</surname>
          </string-name>
          .
          <source>Text Mining with R: A tidy approach</source>
          .
          <year>2020</year>
          . url: https://www.tidytextmining.com/ (visited on 07/27/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>S. G.</given-names>
            <surname>da Silva</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Tehrani</surname>
          </string-name>
          . “
          <article-title>Comparative phylogenetic analyses uncover the ancient roots of Indo-European folktales”</article-title>
          .
          <source>In: Royal Society Open Science</source>
          <volume>3</volume>
          .1 (
          <year>2016</year>
          ), p.
          <fpage>150645</fpage>
          . doi: 10.1098/rsos.150645. (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>O.</given-names>
            <surname>Sobchuk</surname>
          </string-name>
          . “
          <article-title>Charting artistic evolution: an essay in theory”</article-title>
          .
          <source>PhD thesis</source>
          . Tartu: Tartu University,
          <year>2018</year>
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>K.</given-names>
            <surname>Taranovskii</surname>
          </string-name>
          . “
          <article-title>O vzaimootnosheniah stikhotvornogo metra i tematiki”</article-title>
          .
          <source>In: American Contributions to the Fifth International Congress of Slavists</source>
          <volume>1</volume>
          (
          <year>1963</year>
          ), pp.
          <fpage>287</fpage>
          -
          <lpage>332</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tarlinskaja</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Oganesova</surname>
          </string-name>
          . “
          <article-title>Meter and Meaning: The Semantic Halo of Verse Form in English Romantic Lyrical Poems (Iambic and Trochaic Tetrameter)”</article-title>
          .
          <source>In: The American Journal of Semiotics</source>
          <volume>4</volume>
          .3/4 (
          <year>1986</year>
          ), pp.
          <fpage>85</fpage>
          -
          <lpage>106</lpage>
          . doi: https://doi.org/10.5840/ajs198643/422
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tehrani</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Collard</surname>
          </string-name>
          . “
          <article-title>Do Transmission Isolating Mechanisms (TRIMS) influence cultural evolution? Evidence from patterns of textile diversity within and between Iranian tribal groups”</article-title>
          .
          <source>In: Understanding Cultural Transmission in Anthropology: A Critical Synthesis</source>
          . Ed. by R. Ellen, S. Lycett, and S. Johns. Berghahn. Vol.
          <volume>26</volume>
          .
          <year>2013</year>
          , pp.
          <fpage>148</fpage>
          -
          <lpage>164</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>C.</given-names>
            <surname>Tennie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Call</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tomasello</surname>
          </string-name>
          . “
          <article-title>Ratcheting up the ratchet: on the evolution of cumulative culture”</article-title>
          .
          <source>In: Philosophical Transactions of the Royal Society B: Biological Sciences</source>
          <volume>364</volume>
          .1528 (Aug.
          <year>2009</year>
          ), pp.
          <fpage>2405</fpage>
          -
          <lpage>2415</lpage>
          . issn: 0962-8436. doi: 10.1098/rstb.2009.0052
          . (Visited on 09/17/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>M.</given-names>
            <surname>Trunin</surname>
          </string-name>
          . “
          <article-title>Towards the concept of semantic halo”</article-title>
          .
          <source>In: Studia Metrica et Poetica</source>
          <volume>4</volume>
          .2 (Dec.
          <year>2017</year>
          ), pp.
          <fpage>41</fpage>
          -
          <lpage>66</lpage>
          . issn: 2346-691X. doi: 10.12697/smp.2017.4.2.03
          . (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>G.</given-names>
            <surname>Vinokur</surname>
          </string-name>
          . “
          <article-title>Vol'nye iamby Pushkina”</article-title>
          .
          <source>In: Pushkin i ego sovremenniki: Materialy i issledovania</source>
          . Vol.
          <volume>38/39</volume>
          . Leningrad,
          <year>1930</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          et al. “
          <article-title>Linguistic measures of chemical diversity and the “keywords” of molecular collections”</article-title>
          .
          <source>In: Scientific Reports</source>
          <volume>8</volume>
          .1 (May
          <year>2018</year>
          ), p.
          <fpage>7598</fpage>
          . issn: 2045-2322. doi: 10.1038/s41598-018-25440-6. (Visited on 07/26/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>T.-I.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Torget</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          . “
          <article-title>Topic Modeling on Historical Newspapers”</article-title>
          .
          <source>In: Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities</source>
          . Portland OR,
          <year>2011</year>
          , pp.
          <fpage>96</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>G.</given-names>
            <surname>Yu</surname>
          </string-name>
          et al. “
          <article-title>Two Methods for Mapping and Visualizing Associated Data on Phylogeny Using Ggtree”</article-title>
          .
          <source>In: Molecular Biology and Evolution</source>
          <volume>35</volume>
          .12 (
          <year>2018</year>
          ), pp.
          <fpage>3041</fpage>
          -
          <lpage>3043</lpage>
          . issn: 0737-4038. doi: 10.1093/molbev/msy194.
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>A.</given-names>
            <surname>Zeileis</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Kleiber</surname>
          </string-name>
          .
          <source>ineq: Measuring Inequality, Concentration, and Poverty</source>
          . July
          <year>2014</year>
          . url: https://CRAN.R-project.org/package=ineq (visited on 09/13/
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          [59]
          <string-name>
            <given-names>J.</given-names>
            <surname>van Zundert</surname>
          </string-name>
          . “
          <article-title>Computational methods and tools”</article-title>
          .
          <source>In: Handbook of Stemmatology</source>
          . Ed. by P. Roelli. De Gruyter Reference. De Gruyter,
          <year>2020</year>
          , pp.
          <fpage>292</fpage>
          -
          <lpage>356</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>