<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>History Repeats: Historical Phase Recognition from Short Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabio Celli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Basile</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Gruppo Maggioli</institution>
          ,
          <addr-line>Via Bornaccino 101, Santarcangelo di Romagna, 47822</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università di Torino</institution>
          ,
          <addr-line>Via Pessinetto 12, 10149, Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces a new multi-class classification task: the prediction of the Structural-Demographic phase of historical cycles - such as growth, impoverishment and crisis - from text describing historical events. To achieve this, we leveraged data from the Seshat project, annotated it following specific guidelines and then evaluated the consistency between three annotators. The classification experiments, with transformers and Large Language Models, show that 2 of 5 phases can be detected with good accuracy. We believe that this task could have a great impact on comparative history and can be helped by event extraction in NLP.</p>
      </abstract>
      <kwd-group>
        <kwd>Cultural Analytics</kwd>
        <kwd>Structural Demographic Theory</kwd>
        <kwd>LLMs</kwd>
        <kwd>NLP for the Humanities</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction And Background</title>
      <p>
        In the last decade, at least since Brexit [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], many countries in the world have experienced a generalized
polarization, and phenomena of toxic language online have grown [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Hate speech [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], misogyny [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], conspiracy theories [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and related phenomena are just visible
manifestations of deep structural social crises, ushering in periods
of shifting world order [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. While crises may appear
sudden, they are often rooted in underlying factors like
demographics, geopolitics, technological advancements,
and historical-economic cycles. Using the scientific method,
mathematical modelling and the Structural Demographic
Theory (SDT) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] it was possible to formalise secular cycles [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which typically last between 75 and 100 years [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
and predict outbreaks of political instability in complex
societies based on the rate of past crises [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The SDT
defines three actors and five phases of the secular cycle.
      </p>
      <p>The three key actors are:</p>
      <p>• The population, which is the source of the
society’s resources and manpower, represents
approximately 90% of the entire society and is the
part that follows instructions to produce goods
and wealth, consuming only a small part of it.</p>
      <p>• The elites, who typically cover around 8% of the
society, are the groups of people in charge of
finding potential solutions to the problems of the
society and are eligible to become part of the
state. Who is considered part of the elite and how
someone gains or loses elite status depends on
the type of government and the power dynamics
within a society.</p>
      <p>• The state, formed by roughly 2% of the society, is
the government that enforces its will and
manages resources from the population. It is
composed of one or more elite groups, depending on
the social structure, and it crystallizes the culture
to keep the society alive.</p>
      <p>The actors interact in five phases during the secular cycle,
progressively increasing social and political instability:</p>
      <p>1. The growth phase. During this phase a fresh and
effective culture creates social cohesion, the
economy is growing rapidly and the state is
expanding its control over the population. This leads to
increased economic prosperity and stability but
raises the problem of sustainability. Periods of
reconstruction immediately following wars, like
post-war Italy in the 1950s, are examples of this
phase.</p>
      <p>
        2. The population immiseration phase. The
population continues to grow in number while the
economy slows down. This happens because over
the long term the rate of return on capital is
typically greater than the growth rate of population
salaries [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]; as a result the elites get richer and the
population gets poorer. Moreover, demography
has a strong impact on the wealth of the
population: the more workers of the same type are
available, the less likely their wages are to grow.
The state’s ability to extract resources from the
population reaches its limits in this phase. This
can lead to increasing inequality, and social unrest
begins. The United States in the 1890s and 1970s
is an example of this phase.
      </p>
      <p>3. The elite overproduction phase. The population
tries to access the elite ranks but overloads the
social lift mechanisms, which reduces the capability
of the elite to solve problems in the society and
raises the probability of societal instability.
The USSR in the 1950s and the US in the 1990s are
examples of this phase.</p>
      <p>4. The state stress phase. The state’s ability to govern
the population and foster cooperation between
population and elites begins to decline,
and the elites become increasingly fragmented.
This can lead to widespread violence and civil
war. Moreover, the state tends to be in financial
distress as a consequence of the slowed economy and
internal fragmentation, thus any triggering event
that the state cannot manage can turn into a
crisis. Germany in the 1920s is an example.</p>
      <p>5. The crisis, collapse or recovery phase. The state
is either reformed by the elites who find an
agreement or overthrown by internal or external forces.
At the end of this phase a new social equilibrium
is found and a new period of stability begins,
restarting the cycle. Examples are France in the
1790s, the UK in the 1940s, and the US in the 1860s under civil
war and also in the 1930s under the New Deal reforms.</p>
      <p>
        The dynamics described by the SDT are represented in
Figure 1 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. SDT has been used to explain a wide range
of historical events, including the French Revolution, the
American Civil War [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the fall of the Qing Dynasty [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the Russian Revolution and the instability in the US
in recent years.
      </p>
      <p>In this paper we propose a novel multi-class
classification task: given a text describing the historical events
of a decade, find the appropriate SDT phase label. To do
so we exploited historical data from the Seshat project,
produced textual descriptions for decades in the history
of human societies and annotated each decade with SDT
phases following specific annotation guidelines. We
computed inter-annotator agreement between 3 annotators
and experimented with LLMs in classification. The paper
is structured as follows: Section 2 describes the
data, Section 3 the guidelines for the annotation, Section 4 the
classification experiments, and Section 5 the conclusion
and directions for future work.</p>
    </sec>
    <sec id="sec-1-data">
      <title>2. Data</title>
      <p>
        It is not easy to design a dataset for historical data. There
are specific datasets for event detection from text [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ],
for paleoclimatology [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], for census analysis through
time [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and for information extraction from historical
documents [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], but there are few long-term historical
datasets for Structural-Demographic analysis. Crucially,
the Seshat project [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] produced a dataset that contains
machine-readable historical information about global
history. The basic concept of Seshat is to provide
quantitative and structured or semi-structured data about the
evolution of societies, defined as political units (polities),
from 35 sampling points across the globe in a time
window from roughly 10000 BC to 1900 CE, sampled with
a time-step of 100 years. A time-step of 100
years is too coarse-grained to track the
internal phases of the secular cycle, thus we resampled
the data with a time-step of 10 years,
manually integrating data and descriptions from Seshat and
from Wikipedia. To do so, we followed these general
guidelines:
      </p>
      <p>• For each polity in Seshat create a number of rows
to represent each decade. There must be no gaps
between decades. If needed, add polities to fill
the gaps by searching Wikipedia.</p>
      <p>• Read the description of the polity provided in
Seshat, identify dates and map the content to the
corresponding decade.</p>
      <p>• Search Wikipedia to find more information about
the polity that can be mapped into decades. Fill
in as many decades as possible. When dates are
uncertain within a specific time period, use the
median decade of that period.</p>
      <p>• Summarize the content to fit about 400
characters. Focus on the following types of events: wars
or battles; reforms; rulers; population; elites;
disasters or epidemics; alliances or treaties;
socio-economic context; famines or financial stress;
protests or movements; changes of elite; religions
and philosophies. When possible, report the
references about the information found.</p>
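      <p>As an illustration only, the mechanical part of these guidelines (one row per decade with no gaps, and the median decade for uncertain dates) could be sketched in Python as follows. The resampling described above was done manually; the function names, the use of negative integers for years BC, and the example dates are assumptions made for this sketch and are not taken from the Chronos dataset.</p>
      <preformat>
```python
def decade_of(year):
    """Return the first year of the decade containing `year`.

    Assumption: years BC are written as negative integers.
    """
    return (year // 10) * 10


def decade_rows(polity_id, start_year, end_year):
    """Create one empty row per decade between two years, with no gaps."""
    first = decade_of(start_year)
    last = decade_of(end_year)
    return [
        {"polity": polity_id, "decade": decade, "text": ""}
        for decade in range(first, last + 10, 10)
    ]


def median_decade(earliest_year, latest_year):
    """Pick the decade in the middle of an uncertain date range."""
    return decade_of((earliest_year + latest_year) // 2)


# Hypothetical example: the dates are illustrative, not taken from Chronos.
rows = decade_rows("ItRomre", -509, -451)
print([row["decade"] for row in rows])
print(median_decade(620, 651))
```
      </preformat>
      <p>The text field of each row is left empty here, since the roughly 400-character summaries were written by hand from Seshat and Wikipedia.</p>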
      <p>
        We also extended the data to include the polities until the
2010s CE. In order to limit the long and time-consuming
manual data wrangling, we reduced the number of
sampling zones from 35 to 18 but at the same time we kept the
original variety of world regions [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. This, combined
with the extension of the time window, allowed us to
obtain 366 polities (roughly the same number of polities
as Seshat) and 3540 rows with a textual description. We
will call “Chronos” the dataset we produced. It contains
the following features:
      </p>
      <p>• the timestamp of each decade,</p>
      <p>• the Age indicating the periods of history
(prehistoric, ancient, medieval, early-modern, modern,
post-modern),</p>
      <p>• the sampling zone as reported in Figure 2,</p>
      <p>• the world regions related to the sampling zones,</p>
      <p>• a Polity ID formatted with a standard method:
2 letters to indicate the area of origin of the
culture, 3 letters to indicate the name of the
polity, 1 letter to indicate the type of
society (c=culture/community; n=nomads; e=empire;
k=kingdom; r=republic) and 1 letter to indicate
the periodization (t=terminal; l=late; m=middle;
e=early; f=formative; i=initial; *=any). For
example “EsSpael” is the late Spanish Empire, “ItRomre”
is the early Roman Republic and “CnWwsk*” is
the period of the Warring States under the Wei
Chinese dynasty,</p>
      <p>• a short textual description of the decade in Italian
and English.</p>
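      <p>The Polity ID scheme described above is regular enough to be decoded mechanically. The following Python sketch shows one possible parser; the names PolityID and parse_polity_id are hypothetical, and the assumption that every ID is exactly 7 characters long follows from the 2+3+1+1 layout given in the text rather than from an inspection of the dataset.</p>
      <preformat>
```python
from dataclasses import dataclass

# Code tables copied from the description of the Polity ID format.
SOCIETY_TYPES = {
    "c": "culture/community",
    "n": "nomads",
    "e": "empire",
    "k": "kingdom",
    "r": "republic",
}
PERIODS = {
    "t": "terminal",
    "l": "late",
    "m": "middle",
    "e": "early",
    "f": "formative",
    "i": "initial",
    "*": "any",
}


@dataclass
class PolityID:
    area: str      # 2 letters: area of origin of the culture
    name: str      # 3 letters: name of the polity
    society: str   # decoded type of society
    period: str    # decoded periodization


def parse_polity_id(polity_id):
    """Split a 7-character Polity ID into its four components."""
    if len(polity_id) != 7:
        raise ValueError("expected 7 characters, got " + repr(polity_id))
    society_code, period_code = polity_id[5], polity_id[6]
    if society_code not in SOCIETY_TYPES:
        raise ValueError("unknown society type: " + society_code)
    if period_code not in PERIODS:
        raise ValueError("unknown periodization: " + period_code)
    return PolityID(
        area=polity_id[:2],
        name=polity_id[2:5],
        society=SOCIETY_TYPES[society_code],
        period=PERIODS[period_code],
    )


print(parse_polity_id("EsSpael"))
print(parse_polity_id("CnWwsk*"))
```
      </preformat>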
      <p>Short texts can contain one or more events and
references. Consider the following examples extracted from
the Chronos dataset:</p>
      <p>1. introduction of iron from Vietnam by 300 BC
[Bellwood P. 1997. Prehistory of the Indo-Malaysian
Archipelago: Revised Edition pp. 268-307]. Old
Malay as lingua franca.</p>
      <p>2. Siege of Constantinople in 626. The Byzantines won.
Problems in the succession to the throne: Kavadh II
is killed in 628. Years of war with Bizantines had
exhausted the Sasanids who were further
weakened by economic decline; religious unrest and
increasing power of the provincial landholders. King
Yazdegerd III (r. 632-651) could not stand against
the Islamic conquest of Persia.</p>
      <p>
        Example 1 contains a socio-economic context about the
Buni culture of Indonesia and example 2 contains events
about war, rulers, socio-economic context, religion and
elite change about the late Sasanian Empire. The events
in the short textual description are specific to the SDT
and help annotators in their decisions about the
historical phase labels. For example a good socio-economic
context may be a clue of a growth phase and a disaster
may trigger a crisis phase. For this reason we did not
exploit the labels proposed in the literature, such as
second-level HTOED categories or the HISTO classes [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
However, we acknowledge that this is an aspect that requires
further research. All events included in the texts were
manually detected, and the data collectors were trained
to recognize key events from the examples provided in
the literature about SDT [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>3. Annotation and Evaluation</title>
      <p>The main problem with the annotation of phases of
historical cycles is its interpretability. While everyone agrees
that the 1789-1799 period in France was a time of crisis,
reaching a consensus on the impact of the 1860s French
intervention in Mexico proves more difficult. Did it trigger
a phase of impoverishment or of elite overproduction?
Moreover, did the rise of Mao Zedong as leader of China
in the 1950s begin a phase of growth or continue the
previous crisis?</p>
      <p>We defined the following guidelines for the annotation:</p>
      <p>1. Read the textual description to identify key
events: wars, reforms, rulers, population, elites,
disasters, epidemics, alliances or treaties,
socio-economic context, famines or financial stress,
protests or movements, religions.</p>
      <p>[Table 1: inter-annotator agreement per trial (base, trained).]</p>
      <p>
        A single annotator annotated the entire corpus, then
we evaluated the annotation with two different trials
involving students who were not experts in history. We compared
a subset of data annotated by two students to the same
subset annotated by the principal annotator. The first
trial was done just following the guidelines after a
general explanation of the SDT. The second trial was done,
with different students, following the guidelines after a
training session, where the annotation was discussed and
agreed upon. Results, reported in Table 1, show that with
a training session the agreement rises considerably (from
slight to moderate). The base agreement level is
comparable to the one observed in the annotation of hate speech
among 5 trained judges on a non-binary scheme, which
obtained a Fleiss K=0.19 [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. The distribution of the
labels in the Chronos dataset is depicted in Figure 3. In
the standard secular cycle model, the stress phase (label
4) is the most common, followed by the crisis phase (label
5), which is the least common. The other three phases
(labels 1, 2, and 3) occur with roughly equal frequency in
the data.
      </p>
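      <p>For readers who want to reproduce an agreement figure of this kind, the sketch below computes Fleiss' kappa in plain Python for items labelled by the same number of annotators. The text does not state which coefficient was used for Table 1, so the choice of Fleiss' kappa (the statistic quoted for the hate speech comparison) is an assumption, and the toy labels are invented for illustration.</p>
      <preformat>
```python
def fleiss_kappa(ratings, categories=(1, 2, 3, 4, 5)):
    """Fleiss' kappa for a list of items, each rated by the same number of annotators.

    `ratings` is a list of lists: ratings[i] holds the labels given to item i.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of annotators who gave category j to item i
    counts = [[item.count(c) for c in categories] for item in ratings]
    # Observed agreement for each item, then averaged over items.
    per_item = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items
    # Expected agreement from the overall proportion of each category.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    expected = sum(p * p for p in proportions)
    return (observed - expected) / (1 - expected)


# Toy example (invented labels): three annotators, four decades.
toy = [[1, 1, 1], [4, 4, 5], [2, 3, 2], [4, 4, 4]]
print(round(fleiss_kappa(toy), 3))
```
      </preformat>
      <p>The function assumes that at least two categories are used overall; if every annotator gave the same label to every item, the expected agreement would be 1 and the coefficient would be undefined.</p>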
    </sec>
    <sec id="sec-3">
      <title>4. Classification and Discussion</title>
      <p>In order to test the robustness of the Chronos dataset,
we performed cross-validation classification experiments.
The setting is straightforward: each line of the dataset
is considered independently of the others, and we
apply a supervised classification model to predict the
human-annotated label, i.e., the phase (from 1 to 5).</p>
      <p>
        In these experiments, we ignored lines for which no
textual description is available and we used the chance
baseline of F1 = 0.2. As learning model, we fine-tuned
RoBERTa large<sup>1</sup> [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] for the English textual descriptions
and Italian BERT XXL<sup>2</sup> for the Italian texts. We used
a learning rate of 10<sup>-6</sup> and applied early stopping and
model checkpointing, validating each fold on 10% of the
training set.
      </p>
      <p>We performed 5-fold cross validation and measured
the precision, recall, and F1 score of the predicted labels
compared against the gold standard. Table 2 shows the
results of the experiments.</p>
      <p>The classification performance shows that the textual
descriptions in our dataset are sufficient to predict the
corresponding phase to a certain extent, although in quite
an imbalanced way. In particular, the classification of
phases 1 and 4 achieves moderately good results, while
phase 3 is almost never predicted, despite
the rather balanced distribution of labels in the dataset.</p>
      <p>Figure 4: Confusion matrices of the classification of English
(above) and Italian (below) decade descriptions.</p>
      <p>The confusion matrices in Figure 4 further highlight
interesting trends. While the biases of the models in
terms of phases are clear, it is worth noticing that
misclassification happens often between contiguous phases.
This suggests that a more refined, regression-based
learning setting could be more favorable to this kind of
data. Finally, we performed a pilot experiment with a
large language model, namely Llama 3 70B<sup>3</sup>, prompting
the model to elicit zero-shot classifications of the phases
given the textual descriptions in English. The prompt we
used for the model is shown in Figure 5. No particular
decoding strategy was applied for this experiment.
Despite the size of this model, the classification
performance was poor, 5–10 F1 points below the
supervised classification results in the best run. Interestingly,
the zero-shot classification exhibited a similar pattern in
terms of individual labels, with the model strongly biased
towards phases 1 and 4, and unable to properly predict
phases 2 and 3.</p>
      <p>We suggest that, while phases 1 and 4 have similar
types of events in most societies (i.e. reforms or wars won
in phase 1, famines or financial problems in phase 4), there
is much more variability for phases 2, 3 and 5. It must be
noted that these experiments only scratch the surface
of the learning capabilities of the Chronos dataset. In
particular, in this setting, the temporal interdependence
of the decades is not considered, and specific algorithms
should be applied in the future to capture this temporal
structure.</p>
      <p>Prompt used for the zero-shot experiment (Figure 5):</p>
      <preformat>Structural Demographic Theory predicts
outbreaks of political instability in
complex societies, based on three actors:
the population, the elite, and the state.
Each decade is associated with one of five
phases:
1. The 'growth' phase, when a fresh
and effective culture creates social
cohesion, the economy is growing rapidly
and the state is expanding its control over
the population;
2. The 'population immiseration' phase,
when the population continues to grow while
the economy slows;
3. The 'elite overproduction' phase,
when the population tries to access the
elite ranks but overloads the social lift
mechanisms and yields a reduced capability
of the elite to solve problems in the
society;</preformat>
      <p><sup>1</sup> https://huggingface.co/FacebookAI/roberta-large</p>
      <p><sup>2</sup> https://huggingface.co/dbmdz/bert-base-italian-xxl-cased</p>
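      <p>The evaluation protocol described in this section (5-fold cross-validation with per-class precision, recall and F1) can be sketched in plain Python. This is an illustrative sketch and not the code used for the experiments: the helper names k_fold_indices and per_class_scores are hypothetical, the folds are contiguous and unshuffled because the text does not say whether folds were shuffled or stratified, and the fine-tuned model is left out entirely.</p>
      <preformat>
```python
def k_fold_indices(n_rows, k=5):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if extra > i else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def per_class_scores(gold, predicted, labels=(1, 2, 3, 4, 5)):
    """Return {label: (precision, recall, f1)} for each phase label."""
    scores = {}
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy example (invented labels): each held-out fold would be scored like this.
gold = [1, 1, 4, 4, 3]
predicted = [1, 4, 4, 4, 1]
for label, (p, r, f1) in per_class_scores(gold, predicted).items():
    print(label, round(p, 2), round(r, 2), round(f1, 2))
print([len(fold) for fold in k_fold_indices(3540, k=5)])
```
      </preformat>
      <p>In a full experiment each fold would serve once as the test set, with 10% of the remaining rows held out for validation as described above, and the per-class scores would then be averaged over the five folds.</p>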
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion and Future Work</title>
      <p><sup>3</sup> https://huggingface.co/meta-llama/Meta-Llama-3-70B</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was supported by the European Commission
grant 101120657: European Lighthouse to Manifest
Trustworthy and Green AI - ENFIELD.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Celli</surname>
          </string-name>
          , E. Stepanov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Poesio</surname>
          </string-name>
          , G. Riccardi,
          <article-title>Predicting Brexit: Classifying agreement is better than sentiment and pollsters</article-title>
          ,
          <source>in: Proceedings of the Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media (PEOPLES)</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>110</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Celli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramponi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bosco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <article-title>HaSpeeDe3 at EVALITA 2023: Overview of the political and religious hate speech detection task</article-title>
          , in: M. Lai, S. Menini, M. Polignano, V. Russo, R. Sprugnoli, G. Venturi (Eds.),
          <source>Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)</source>
          , Parma, Italy, September 7th-8th, 2023, volume
          <volume>3473</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bianchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Attanasio</surname>
          </string-name>
          ,
          <article-title>HATE-ITA: Hate speech detection in Italian social media text</article-title>
          ,
          <source>in: Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>260</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E. W.</given-names>
            <surname>Pamungkas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Cignarella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          , et al.,
          <article-title>Automatic identification of misogyny in English and Italian tweets at EVALITA 2018 with a multilingual hate lexicon</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , 1, CEUR-WS,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Tekiroglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-L.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guerini</surname>
          </string-name>
          ,
          <article-title>Generating counter narratives against online hate speech: Data and strategies</article-title>
          , arXiv preprint arXiv:2004.04216 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Dalio</surname>
          </string-name>
          ,
          <article-title>Principles for dealing with the changing world order: Why nations succeed or fail</article-title>
          ,
          <source>Simon and Schuster</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Goldstone</surname>
          </string-name>
          ,
          <article-title>Demographic structural theory: 25 years on</article-title>
          ,
          <source>Cliodynamics</source>
          <volume>8</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Korotaev</surname>
          </string-name>
          ,
          <article-title>Introduction to social macrodynamics: Secular cycles and millennial trends in Africa</article-title>
          ,
          <publisher-name>Editorial URSS</publisher-name>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Turchin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Nefedov</surname>
          </string-name>
          ,
          <article-title>Secular cycles</article-title>
          , in: Secular Cycles, Princeton University Press,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Turchin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korotayev</surname>
          </string-name>
          ,
          <article-title>The 2010 structural-demographic forecast for the 2010-2020 decade: A retrospective assessment</article-title>
          ,
          <source>PLoS ONE</source>
          <volume>15</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Piketty</surname>
          </string-name>
          ,
          <article-title>Capital in the twenty-first century</article-title>
          , Harvard University Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Bennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Whitehouse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>François</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Feeney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Levine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Reddish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Turchin</surname>
          </string-name>
          ,
          <article-title>Flattening the curve: Learning the lessons of world history to mitigate societal crises</article-title>
          , osf.io (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Turchin</surname>
          </string-name>
          ,
          <source>A Structural-Demographic Analysis of American History</source>
          , Beresta Books, Chaplin,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Orlandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Bennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Benam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kohn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Turchin</surname>
          </string-name>
          ,
          <article-title>Structural-demographic analysis of the Qing Dynasty (1644-1912) collapse in China</article-title>
          ,
          <source>PLoS ONE</source>
          <volume>18</volume>
          (
          <year>2023</year>
          )
          <elocation-id>e0289748</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonelli</surname>
          </string-name>
          ,
          <article-title>One, no one and one hundred thousand events: Defining and processing events in an inter-disciplinary perspective</article-title>
          ,
          <source>Natural Language Engineering</source>
          <volume>23</volume>
          (
          <year>2017</year>
          )
          <fpage>485</fpage>
          -
          <lpage>506</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Van Bavel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Curtis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Hannaford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moatsos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Roosen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Soens</surname>
          </string-name>
          ,
          <article-title>Climate and society in long-term perspective: Opportunities and pitfalls in the use of historical datasets</article-title>
          ,
          <source>Wiley Interdisciplinary Reviews: Climate Change</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <elocation-id>e611</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R.</given-names>
            <surname>Abramitzky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Boustan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Eriksson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Feigenbaum</surname>
          </string-name>
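          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <article-title>Automated linking of historical data</article-title>
          ,
          <source>Journal of Economic Literature</source>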
          <volume>59</volume>
          (
          <year>2021</year>
          )
          <fpage>865</fpage>
          -
          <lpage>918</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>F.</given-names>
            <surname>Boschetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrea</given-names>
            <surname>C.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Felice</given-names>
            <surname>D.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lebani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lucia</given-names>
            <surname>P.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Paolo</given-names>
            <surname>P.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Giulia</given-names>
            <surname>V.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Simonetta</given-names>
            <surname>M.</surname>
          </string-name>
          , et al.,
          <article-title>Computational analysis of historical documents: An application to Italian war bulletins in World War I and II</article-title>
          , in:
          <source>Proceedings of the LREC 2014 Workshop on Language Resources and Technologies for Processing and Linking Historical Documents and Archives (LRT4HDA 2014)</source>
          , ELRA,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Turchin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Whitehouse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>François</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Alves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Baines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bartkowiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bennett</surname>
          </string-name>
          , et al.,
          <article-title>An introduction to Seshat: Global History Databank</article-title>
          ,
          <source>Journal of Cognitive Historiography</source>
          <volume>5</volume>
          (
          <year>2020</year>
          )
          <fpage>115</fpage>
          -
          <lpage>123</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>F.</given-names>
            <surname>Celli</surname>
          </string-name>
          ,
          <article-title>Feature Engineering for Quantitative Analysis of Cultural Evolution</article-title>
          ,
          <source>Technical Report</source>
          , Center for Open Science,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonelli</surname>
          </string-name>
          ,
          <article-title>Novel event detection and classification for historical texts</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>45</volume>
          (
          <year>2019</year>
          )
          <fpage>229</fpage>
          -
          <lpage>265</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>F.</given-names>
            <surname>Del Vigna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cimino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dell'Orletta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Petrocchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tesconi</surname>
          </string-name>
          ,
          <article-title>Hate me, hate me not: Hate speech detection on Facebook</article-title>
          , in:
          <source>Proceedings of the First Italian Conference on Cybersecurity (ITASEC17)</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>F.</given-names>
            <surname>Poletto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bosco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <article-title>Resources and benchmark corpora for hate speech detection: a systematic review</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          <volume>55</volume>
          (
          <year>2021</year>
          )
          <fpage>477</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          , CoRR abs/1907.11692 (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>