=Paper= {{Paper |id=Vol-3878/23_main_long |storemode=property |title=History Repeats: Historical Phase Recognition from Short Texts |pdfUrl=https://ceur-ws.org/Vol-3878/23_main_long.pdf |volume=Vol-3878 |authors=Fabio Celli,Valerio Basile |dblpUrl=https://dblp.org/rec/conf/clic-it/CelliB24 }} ==History Repeats: Historical Phase Recognition from Short Texts== https://ceur-ws.org/Vol-3878/23_main_long.pdf
History Repeats: Historical Phase Recognition from Short Texts
Fabio Celli1,*, Valerio Basile2
1 Gruppo Maggioli, Via Bornaccino 101, Santarcangelo di Romagna, 47822, Italy
2 Università di Torino, Via Pessinetto 12, 10149, Torino, Italy


                                                  Abstract
                                                  This paper introduces a new multi-class classification task: the prediction of the Structural-Demographic phase of historical
                                                  cycles - such as growth, impoverishment and crisis - from text describing historical events. To achieve this, we leveraged
                                                  data from the Seshat project, annotated it following specific guidelines and then evaluated the consistency between three
                                                  annotators. The classification experiments, with transformers and Large Language Models, show that 2 of 5 phases can be
                                                  detected with good accuracy. We believe that this task could have a great impact on comparative history and can be helped
                                                  by event extraction in NLP.

                                                  Keywords
                                                  Cultural Analytics, Structural Demographic Theory, LLMs, NLP for the Humanities



CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
* Corresponding author.
fabio.celli@maggioli.it (F. Celli); valerio.basile@unito.it (V. Basile)
https://github.com/facells/fabio-celli-publications (F. Celli); https://www.unito.it/persone/vabasile (V. Basile)
ORCID: 0000-0002-7309-5886 (F. Celli); 0000-0001-8110-6832 (V. Basile)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction and Background

In the last decade, at least since Brexit [1], many countries have experienced generalized polarization, and toxic language online has grown [2]. Hate speech [3], misogyny [4], conspiracy theories [5] and related phenomena are just visible manifestations of deep structural social crises, ushering in periods of shifting world order [6]. While crises may appear sudden, they are often rooted in underlying factors like demographics, geopolitics, technological advancements, and historical-economic cycles. Using the scientific method, mathematical modelling and the Structural Demographic Theory (SDT) [7], it has been possible to formalise secular cycles [8], which typically last between 75 and 100 years [9], and to predict outbreaks of political instability in complex societies based on the rate of past crises [10]. The SDT defines three actors and five phases of the secular cycle. The three key actors are:

  • The population, which is the source of the society’s resources and manpower, represents approximately 90% of the entire society and is the part that follows instructions to produce goods and wealth, consuming only a small part of it.
  • The elites, who typically make up around 8% of the society, are the groups of people in charge of finding potential solutions to the problems of the society and are eligible to become part of the state. Who is considered part of the elite and how someone gains or loses elite status depends on the type of government and the power dynamics within a society.
  • The state, formed by roughly 2% of the society, is the government that enforces its will and manages resources from the population. It is composed of one or more elite groups, depending on the social structure, and it crystallizes the culture to keep the society alive.

The actors interact in five phases during the secular cycle, progressively increasing social and political instability:

  1. The growth phase. During this phase a fresh and effective culture creates social cohesion, the economy is growing rapidly and the state is expanding its control over the population. This leads to increased economic prosperity and stability but raises the problem of sustainability. Periods of reconstruction immediately following wars, like post-war Italy in the 1950s, are examples of this phase.
  2. The population immiseration phase. The population continues to grow in number while the economy slows down. This happens because over the long term the rate of return on capital is typically greater than the growth rate of population salaries [11]; as a result, the elites get richer and the population gets poorer. Moreover, demography has a strong impact on the wealth of the population: the more workers of the same type are available, the less likely their wages are to grow. The state’s ability to extract resources from the population reaches its limits in this phase. This




CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073
Figure 1: Time chart depicting the dynamics and phases described by the Structural-Demographic Theory.



     can lead to increasing inequality, and social unrest begins. The United States in the 1890s and 1970s are examples of this phase.
  3. The elite overproduction phase. The population tries to access the elite ranks but overloads the social lift mechanisms, reducing the capability of the elite to solve problems in the society, which raises the probability of societal instability. The USSR in the 1950s and the US in the 1990s are examples of this phase.
  4. The state stress phase. The state’s ability to govern the population and foster cooperation between population and elites begins to decline, and the elites become increasingly fragmented. This can lead to widespread violence and civil war. Moreover, the state tends to be in financial distress as a consequence of a slowed economy and internal fragmentation, thus any triggering event that the state cannot manage can escalate into a crisis. Germany in the 1920s is an example.
  5. The crisis, collapse or recovery phase. The state is either reformed by the elites who find an agreement or overthrown by internal or external forces. At the end of this phase a new social equilibrium is found and a new period of stability begins, restarting the cycle. Examples are France in the 1790s, the UK in the 1940s, and the US in the 1860s during the Civil War and in the 1930s under the New Deal reforms.

The dynamics described by the SDT are represented in Figure 1 [12]. SDT has been used to explain a wide range of historical events, including the French Revolution, the American Civil War [13], the fall of the Qing Dynasty [14], the Russian Revolution and the instability in the US in recent years.

In this paper we propose a novel multi-class classification task: given a text describing the historical events of a decade, find the appropriate SDT phase label. To do so we exploited historical data from the Seshat project, produced textual descriptions for decades in the history of human societies and annotated each decade with SDT phases following specific annotation guidelines. We computed inter-annotator agreement between 3 annotators and experimented with LLMs in classification. The paper is structured as follows: Section 2 describes the data, Section 3 the annotation guidelines and their evaluation, Section 4 the classification experiments, and Section 5 the conclusions and directions for future work.

2. Data

It is not easy to design a dataset for historical data. There are specific datasets for event detection from text [15], for paleoclimatology [16], for census analysis through time [17] and for information extraction from historical documents [18], but there are few long-term historical datasets for Structural-Demographic analysis. Crucially, the Seshat project [19] produced a dataset that contains machine-readable historical information about global history. The basic concept of Seshat is to provide quantitative and structured or semi-structured data about the evolution of societies, defined as political units (polities), from 35 sampling points across the globe in a time window from roughly 10000 BC to 1900 CE, sampled with a time step of 100 years. A time step of 100 years is too coarse-grained to track the internal phases of the secular cycle, so we resampled the data with a time step of 10 years, manually integrating data and descriptions from Seshat and from Wikipedia. To do so, we followed these general guidelines:

  • For each polity in Seshat create a number of rows to represent each decade. There must be no gaps between decades. If needed, add polities to fill the gaps by searching Wikipedia.
  • Read the description of the polity provided in Seshat, identify dates and map the content to the corresponding decade.
  • Search Wikipedia to find more information about the polity that can be mapped into decades. Fill in as many decades as possible. When dates are uncertain within a specific time period, use the median decade of that period.
  • Summarize the content to fit about 400 characters. Focus on the following types of events: wars or battles; reforms; rulers; population; elites; disasters or epidemics; alliances or treaties; socio-economic context; famines or financial stress; protests or movements; changes of elite; religions and philosophies. When possible, report the references for the information found.

Figure 2: Distribution of the sampling zones. There are two sampling zones per world region: North America (US, Mexico), Oceania (Hawaii, Madang - Papua New Guinea), South America (Ecuador, Peru), Europe (France, Italy), Africa (Egypt, Ghana), Middle East (Levant, Iraq), Eurasia (Turkey, Siberia), South Asia (Uttar Pradesh - India, Java - Indonesia), East Asia (Henan - China, Japan)

We also extended the data to include the polities until the 2010s CE. In order to limit the long and time-consuming manual data wrangling, we reduced the number of sampling zones from 35 to 18 while keeping the original variety of world regions [20]. This, combined with the extension of the time window, allowed us to obtain 366 polities (roughly the same number of polities as Seshat) and 3540 rows with a textual description. We call the dataset we produced “Chronos”. It contains the following features:

  • timestamp of each decade,
  • the Age indicating the periods of history (prehistoric, ancient, medieval, early-modern, modern, post-modern),
  • the sampling zone as reported in Figure 2,
  • the world regions related to the sampling zones,
  • a Polity ID formatted with a standard method: 2 letters to indicate the area of origin of the culture, 3 letters to indicate the name of the polity, 1 letter to indicate the type of society (c=culture/community; n=nomads; e=empire; k=kingdom; r=republic) and 1 letter to indicate the periodization (t=terminal; l=late; m=middle; e=early; f=formative; i=initial; *=any). For example “EsSpael” is the late Spanish Empire, “ItRomre” is the early Roman Republic and “CnWwsk*” is the period of the Warring States under the Wei Chinese dynasty,
  • a short textual description of the decade in Italian and English.

Trial     Examples   Raters   Labels   K
base      93         3        5        0.206
trained   93         3        5        0.455
Table 1: Inter-Annotator Agreement (Fleiss’ Kappa) on the annotation of secular cycle phases.

Short texts can contain one or more events and references. Consider the following examples extracted from the Chronos dataset:
  1. introduction of iron from Vietnam by 300 BC [Bellwood P. 1997. Prehistory of the Indo-Malaysian Archipelago: Revised Edition pp. 268-307]. Old Malay as lingua franca.
  2. Siege of Constantinople in 626. The Byzantines won. Problems in the succession to the throne: Kavadh II is killed in 628. Years of war with Bizantines had exhausted the Sasanids who were further weakened by economic decline; religious unrest and increasing power of the provincial landholders. King Yazdegerd III (r. 632-651) could not stand against the Islamic conquest of Persia.

Example 1 contains a socio-economic context about the Buni culture of Indonesia and example 2 contains events about war, rulers, socio-economic context, religion and elite change in the late Sasanian Empire. The events in the short textual description are specific to the SDT and help annotators in their decisions about the historical phase labels. For example, a good socio-economic context may be a clue of a growth phase and a disaster may trigger a crisis phase. For this reason we did not exploit the labels proposed in the literature, such as second-level HTOED categories or the HISTO classes [21]. However, we acknowledge that this is an aspect that requires further research. All events included in the texts were manually detected, and the data collectors were trained to recognize key events from the examples provided in the literature about SDT [12].

3. Annotation and Evaluation

The main problem with the annotation of phases of historical cycles is its interpretability. While everyone agrees the 1789-1799 period in France was a time of crisis, reaching a consensus on the impact of the 1860s French intervention in Mexico proves more difficult. Did it trigger a phase of impoverishment or of elite overproduction? Moreover, did the rise of Mao Zedong as leader of China in the 1950s begin a phase of growth or continue the previous crisis?

We defined the following guidelines for the annotation:

  1. Read the textual description to identify key events: wars, reforms, rulers, population, elites, disasters, epidemics, alliances or treaties, socio-economic context, famines or financial stress, protests or movements, religions.
  2. Use polity identifiers to find the start and end points of cultures. The end of a culture represents a crisis period.
  3. Starting from the beginning of a culture, initially assign the sequence of labels of a standard secular cycle model: 1,1,2,2,3,3,4,4,4,5 and then evaluate whether to keep or change the labels in each decade. It is possible to have longer or shorter cycles. There can be only one label 5 (crisis) per cycle. A polity can have one or more cycles.
  4. Having in mind the key events in the textual description, select one of the following labels to describe the decade:
     1=Growth. A society is generally poor when it experiences renewal or change followed by demographic (but not always territorial or economic) growth. Reforms, alliances, wars won or similar events are potential indicators of this phase.
     2=Impoverishment of the population. Potential economic and/or territorial expansion slows while demography continues to expand. The elite takes much of the wealth and defines the status symbols. Stability and external attacks are potential indicators of this phase.
     3=Overproduction of the elites. The wealthy seek to translate their wealth into positions of authority and prestige. The population becomes poor. Movements, protests, and wars are potential indicators of this phase.
     4=State stress. The elites want to institutionalize their advantages in the form of low taxes and privileges that lead the state into fiscal difficulties. Wars, protests and changes in the elite are potential indicators of this phase.
     5=Crisis. A triggering event such as a war, revolt, famine or disaster that the state is unable to manage leads to a new configuration of society. Emigration of elites, subjugation to other societies, civil wars or profound reforms are potential indicators of this phase.
  5. Use the progressive order of the phases if no textual description is available for the decade.
  6. Make sure there is a progressive order of the labels (e.g. phase 3 must follow phase 2). All labels can be repeated in the following decade except the crisis phase, which conventionally lasts one decade.

A single annotator annotated the entire corpus, then
we evaluated the annotation with two different trials involving students who were not experts in history. We compared a subset of data annotated by two students to the same subset annotated by the principal annotator. The first trial was done just following the guidelines after a general explanation of the SDT. The second trial was done, with different students, following the guidelines after a training session, where the annotation was discussed and agreed upon. Results, reported in Table 1, show that with a training session the agreement rises considerably (from slight to moderate). The base agreement level is comparable to the one observed in the annotation of hate speech among 5 trained judges on a non-binary scheme, which obtained a Fleiss K=0.19 [22] [23]. The distribution of the labels in the Chronos dataset is depicted in Figure 3. As in the standard secular cycle model, the stress phase (label 4) is the most common and the crisis phase (label 5) is the least common. The other three phases (labels 1, 2, and 3) occur with roughly equal frequency in the data.

Figure 3: Distribution of the labels in the Chronos dataset.

4. Classification and Discussion

In order to test the robustness of the Chronos dataset, we performed cross-validation classification experiments. The setting is straightforward: each line of the dataset is considered independently of the others, and we apply a supervised classification model to predict the human-annotated label, i.e., the phase (from 1 to 5).

In these experiments, we ignored lines for which no textual description is available and we used the chance baseline of F1 = 0.2. As learning models, we fine-tuned RoBERTa large¹ [24] for the English textual descriptions and Italian BERT XXL² for the Italian texts. We used a learning rate of 10^-6 and applied early stopping and model checkpointing, validating each fold on 10% of the training set.

We performed 5-fold cross validation and measured the precision, recall, and F1 score of the predicted labels compared against the gold standard. Table 2 shows the results of the experiments.

English
Phase   Precision   Recall   F1-score
1       0.542       0.486    0.513
2       0.338       0.256    0.291
3       0.242       0.048    0.080
4       0.319       0.601    0.416
5       0.330       0.364    0.346

Italian
Phase   Precision   Recall   F1-score
1       0.489       0.510    0.499
2       0.321       0.211    0.254
3       0.191       0.044    0.071
4       0.290       0.660    0.403
5       0.397       0.186    0.254
Table 2: Results of 5-fold multiclass classification experiments. Results above the baseline (0.2) are marked in bold.

The classification performance shows that the textual descriptions in our dataset are sufficient to predict the corresponding phase to a certain extent, although in quite an imbalanced way. In particular, the classification of phases 1 and 4 achieves moderately good results, while phase 3 is almost never predicted, despite the rather balanced distribution of labels in the dataset.

Figure 4: Confusion matrices of the classification of English (above) and Italian (below) decade descriptions.

¹ https://huggingface.co/FacebookAI/roberta-large
² https://huggingface.co/dbmdz/bert-base-italian-xxl-cased

The confusion matrices in Figure 4 further highlight
interesting trends. While the biases of the models in used for the model is shown in Figure 5. No particular
terms of phases are clear, it is worth noticing that mis- decoding strategy was applied for this experiment.
classification happens often between contiguous phases.      Despite the dimension of this model, the classification
                                                          performance was poor, 5–10 F1 points below the super-
    Structural     Demographic      Theory    predicts    vised classification results at the best try. Interestingly,
    outbreaks of political instability in                 the zero-shot classification exhibited a similar pattern in
    complex societies, based on three actors:             terms of individual labels, with the model strongly biased
    the population, the elite, and the state.
                                                          towards phase 1 and 4, and unable to properly predict
    Each decade is associated with one of five
    phases:
                                                          phases 2 and 3.
                                                             We suggest that, while phases 1 and 4 have similar
    1.      The ’growth’ phase, when a fresh              types of events in most societies (i.e. reforms or won wars
    and     effective    culture     creates     social   in phase 1, famines or financial problems in phase 4) there
    cohesion, the economy is growing rapidly              is much more variability for phases 2, 3 and 5. It must be
    and the state is expanding its control over           noted that these experiments only scratches the surface
    the population;                                       of the learning capabilities of the Chronos dataset. In
                                                          particular, in this setting, the temporal interdependence
    2.     The ’population immiseration’ phase,           of the decades is not considered, and specific algorithms
    when the population continues to grow while
                                                          should be applied in the future to capture this temporal
    the economy slows;
                                                          structure.
      3.    The ’elite overproduction’ phase,
      when the population tries to access the
      elite ranks but overloads the social lift
      mechanisms and yields a reduced capability of the elite to solve
      problems in the society;

      4. The ’state stress’ phase, when the state’s ability to govern the
      population and foster cooperation between population and elites
      begins to decline, and the elites become increasingly fragmented;

      5. The ’crisis, collapse or recovery’ phase, when the state is either
      reformed by the elites or overthrown by internal or external forces;

      Act as a highly intelligent historian chatbot. You will be given the
      description of a decade and you are asked to predict the phase
      number. Please output only a number from 1 to 5.

      Decade: textual description

      Phase:

Figure 5: Prompt for zero-shot classification experiments with Llama 3 70B.

   This suggests that a more refined, regression-based learning setting could be more favorable to this kind of data. Finally, we performed a pilot experiment with a large language model, namely Llama 3 70B³, prompting the model for zero-shot classifications of the phases given the textual descriptions in English. The prompt we

³ https://huggingface.co/meta-llama/Meta-Llama-3-70B

5. Conclusion and Future

We introduced a new classification task named historical phase recognition. We believe that, once their performance improves, classification algorithms trained for this task will allow us to automatically annotate many more polities with secular cycles, a potentially disruptive improvement for the study of comparative history. We believe that inter-annotator agreement can be further improved by having domain experts annotate the data. Additionally, the automatic extraction of events from short historical texts, or the definition of guidelines for their annotation, can be a valuable tool in both the annotation and classification tasks. By combining these two approaches, we can improve the dataset and make it more reliable.
   In future work, we plan to improve classification performance by including temporal interdependence factors, and to improve inter-annotator agreement, also calculating the agreement between labels generated by models and by humans. It would also be interesting to add event structure annotations such as TimeML to Chronos. The poor performance in zero-shot classification using an LLM is likely a function of the sophisticated reasoning and world knowledge required to perform the task. The LLM could benefit from more advanced prompting strategies (e.g. few-shot or chain-of-thought) or even supervision in the form of fine-tuning.
   The Chronos dataset is accessible online in viewer/commenter mode⁴. Edit and download access is available upon request.

⁴ https://docs.google.com/spreadsheets/d/1OW6CtmUudN3WTJ1VvWRZYZdTWVEjDJGns6Q8_I6EBwk/edit?usp=sharing
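As an illustration only, the zero-shot setup of Figure 5 and the planned comparison between model-generated and human labels could be sketched as follows. This is not the code used in the experiments. The helper names `build_prompt`, `parse_phase` and `cohen_kappa` are hypothetical, the phase definitions that open the prompt are passed in as a parameter rather than reproduced, no model is actually called, and Cohen's kappa is an assumed choice of agreement measure.

```python
import re
from collections import Counter

# Instruction text taken from the prompt in Figure 5.
INSTRUCTION = (
    "Act as a highly intelligent historian chatbot. You will be given the "
    "description of a decade and you are asked to predict the phase number. "
    "Please output only a number from 1 to 5."
)


def build_prompt(phase_definitions, decade_description):
    """Assemble a zero-shot prompt (hypothetical helper).

    `phase_definitions` is the list of phase descriptions that opens the
    prompt; it is supplied by the caller and not reproduced here.
    """
    return (
        f"{phase_definitions}\n\n{INSTRUCTION}\n\n"
        f"Decade: {decade_description}\n\nPhase:"
    )


def parse_phase(model_output):
    """Return the first standalone digit 1-5 in the reply, or None."""
    match = re.search(r"\b([1-5])\b", model_output)
    return int(match.group(1)) if match else None


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two label sequences of equal length.

    Assumption: kappa is used here only as one possible agreement measure.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:  # both sequences use one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    prompt = build_prompt("<phase definitions>", "<textual description>")
    print(prompt)
    human = [1, 2, 3, 4, 5, 1]   # made-up labels, for illustration only
    model = [1, 2, 3, 4, 5, 2]
    print(round(cohen_kappa(human, model), 3))
```

Replies from which `parse_phase` returns `None` would need separate handling, for example by counting them as errors, before any agreement score is computed.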
Acknowledgments

This work was supported by the European Commission grant 101120657: European Lighthouse to Manifest Trustworthy and Green AI - ENFIELD.