<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Literary Time Travel: Distinguishing Past and Contemporary Worlds in Danish and Norwegian Fiction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jens Bjerring-Hansen</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ali Al-Laith</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Hershcovich</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Conroy</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Ørtoft Rasmussen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Comparative Literature and Rhetoric, Aarhus University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Copenhagen</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Nordic Studies and Linguistics, University of Copenhagen</institution>
        </aff>
      </contrib-group>
      <fpage>772</fpage>
      <lpage>787</lpage>
      <abstract>
        <p>The classification of historical and contemporary novels is a nuanced task that has traditionally relied on expert literary analysis. This paper introduces a novel dataset comprising Danish and Norwegian novels from the last 30 years of the 19th century, annotated by literary scholars to distinguish between historical and contemporary works. While this manual classification is time-consuming and subjective, our approach leverages pre-trained language models to streamline and potentially standardize this process. We evaluate their effectiveness in automating this classification by examining their performance on titles and the first few sentences of each novel. After fine-tuning, the models show good performance but fail to fully capture the nuanced understanding exhibited by literary scholars. This research underscores the potential and limitations of NLP in literary genre classification and suggests avenues for further improvement, such as incorporating more sophisticated model architectures or hybrid methods that blend machine learning with expert knowledge. Our findings contribute to the broader field of computational humanities by highlighting the challenges and opportunities in automating literary analysis.</p>
      </abstract>
      <kwd-group>
        <kwd>Historical Text</kwd>
        <kwd>Text Classification</kwd>
        <kwd>Danish</kwd>
        <kwd>Norwegian</kwd>
        <kwd>Literature</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>a late 18th-century setting. In contrast, Albert Gnudtzman’s Ridder Thorvald. En lille
københavnsk Roman (Knight Thorvald. A small Copenhagen novel, 1899) initially misleads with the
historical-sounding keyword “knight” in the title, but the opening scene set in a modern urban
café distinctly establishes it as a contemporary novel.</p>
      <p>
        The question of whether a novel is set in modern days or historical times (contemporary or
historical novel?) was by no means uncontroversial in the time of the so-called Modern
Breakthrough in Scandinavian literature circa 1870-1900 [6]. On the contrary, it was a question
of taste (good vis-à-vis bad) and, accordingly, a detection and quantification of the historical
novel give insight into the cultural divides of the period. Modern realist aesthetics ostracized
the historical novel and insisted that literature should be situated in the present and address
current problems. In 1871, famously and characteristically, the influential Danish critic Georg
Brandes, referring to Scott’s Waverley novels from the early 1800s, rejected the historical novel
as “an unfortunate and now abandoned genre, imported from Scotland and invented by a
pureblooded Tory, which originated in a state of mind similar to ours, one with all its ideals in the
past” [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. And at least for a while, the historical novel was aesthetically and socially demoted
to the realm of popular literature, which no one, except for readers and consumers, cared about.
In the 20th century advanced definitions of the historical novel and its complex relationship
to its political contexts and to the development of the genre towards realism and modernism
have been major points of discussion in the historiography of the novel. In our paper, we
approach the question more directly by tentatively putting ourselves in the place of the
historical actors, professional tastemakers like Brandes or conventional consumers in the literary
marketplace, and emphasizing some immediate, easily decodable genre signals from paratext
(titles and subtitles) and text (the opening of the novels).
      </p>
      <p>We introduce a dataset of Danish and Norwegian novels from the last 30 years of the 1800s,
annotated by literary scholars according to whether they are historical or contemporary. The
novels are taken from the MeMo (Measuring Modernity) corpus [5], comprising 859 novels.
We assess the ability of language models to generalize this generic distinction as expressed in
their titles and first few sentences, by training them on a portion of the dataset and evaluating
them on unseen novels. While fine-tuned Danish language models show good performance in
the task, error analysis reveals they still lack sensitivity to salient cues that literary scholars
observe.<sup>1</sup></p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Text classification is a pivotal task in natural language processing (NLP) that entails
categorizing text into predefined labels or classes. It has a broad spectrum of applications, including
sentiment analysis [1], word sense disambiguation [2<xref ref-type="bibr" rid="ref2">2</xref>], named entity recognition [1<xref ref-type="bibr" rid="ref19 ref2 ref4">2, 4, 19</xref>],
and genre classification [
        <xref ref-type="bibr" rid="ref21 ref33">32, 21</xref>
        ]. With the advent of pre-trained language models like BERT
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], GPT [
        <xref ref-type="bibr" rid="ref36">35</xref>
        ], and their variants, significant advancements have been achieved in this
domain. These models leverage extensive text corpora to enhance the understanding of context
and semantics [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], establishing new standards in accuracy and robustness. Consequently,
they enable more nuanced and sophisticated text classification systems capable of handling
diverse and complex textual data.<fn><label>1</label><p>Our dataset, code and models will be made publicly available upon publication.</p></fn> Current research continues to investigate enhancements in
model architectures, fine-tuning techniques, and domain-specific adaptations to further boost
the performance of text classification tasks.
      </p>
      <p>
        When dealing with literature, in academic as well as everyday contexts, taxonomic thinking
and practices seem both habitual and inevitable. Literary genre studies got underway with
the Ancient Greeks, from which the division of poetic literature in three main genres: lyric,
epic and drama, often ascribed to Aristotle and his Poetics, has proliferated [1<xref ref-type="bibr" rid="ref7">7</xref>]. Since then an
enormous and ever-growing body of genre theory has developed [14]. Of special interest to us
is:
1. scholarship on the historical novel of the 19th century, which often serves as the
predecessor and/or antidote to the modern realist (and contemporary set) novel [25, 1<xref ref-type="bibr" rid="ref3 ref41 ref48">3, 40, 47</xref>],
2. historical studies, influenced by the sociology of literature and the history of the book,
concerned with genre fiction and its aesthetic and commercial development in the 19th
century [<xref ref-type="bibr" rid="ref16 ref37">36, 16</xref>], and
3. (non-digital) quantitative approaches to the history of the novel [28, <xref ref-type="bibr" rid="ref15 ref30 ref34">33, 30, 15</xref>].
      </p>
      <p>
        Within computational literary studies of recent years, genre has been an important touch
point for NLP approaches and literary theory and historiography. Text genre classification is a
crucial area of research that aids in systematically categorizing vast and diverse collections of
literary works. This task involves distinguishing between various genres such as fiction and
non-fiction [
        <xref ref-type="bibr" rid="ref46 ref47">46, 45, 34</xref>
        ], poetry [
        <xref ref-type="bibr" rid="ref38">37</xref>
        ], and drama [
        <xref ref-type="bibr" rid="ref39">38</xref>
        ], among others, within literary corpora.
Also, significant efforts have been made to classify novels into various sub-genres,
predominantly with a focus on volume-level similarity across a range of features that capture significant
generic aspects [
        <xref ref-type="bibr" rid="ref45 ref47 ref8">46, 8, 44</xref>
        ]. The advancements facilitated by NLP techniques and machine
learning, including predictive modeling, are substantial, resulting in more accurate and automated
genre classification while also embracing notions from contemporary literary scholarship that
a literary genre comprises many features rather than a single defining characteristic [39, <xref ref-type="bibr" rid="ref23 ref45">44, 23</xref>]. As Ted Underwood has argued, “[t]he best way to measure the differentiation between
literary genres is probably to train supervised predictive models that attempt to distinguish
works in one genre from other works in a given period or cultural milieu” [39].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Historical vs. Contemporary Novels</title>
      <p>
        There is a long and intensive research tradition that has been interested in the political and
social-historical implications of the historical novel of the 19th century and the decline of the
genre in the latter part of the century as a reflex of new aesthetic positions with a primacy
of immediate perception and contemporaneity [2<xref ref-type="bibr" rid="ref2 ref29 ref5">5, 2, 29</xref>]. In this context, complex definitions
have been drafted on the basis of intensive close readings of particularly British, French and
Russian novels. Lukács’ five principal claims about the genre are a pioneering example of this
[<xref ref-type="bibr" rid="ref25">25</xref>]. However, in practice, the question of categorization poses fewer problems for both literary
scholars and customers at bookstores. If you consider a common definition of the genre, such
as this one from a literary reference work, it will correspond to most readers’ common and
more or less reflected perception of the genre:
<disp-quote><p>A novel in which the action takes place during a specific historical period well
before the time of writing (often one or two generations before, sometimes several
centuries), and in which some attempt is made to depict accurately the customs
and mentality of the period.<sup>2</sup></p></disp-quote>
      </p>
      <p>In this paper, we construct a dataset using the genre classification of the novels of the MeMo
corpus performed by Bjerring-Hansen and Ørtoft [6], which followed such a pragmatic and
intuitive understanding of the genre as something that can be decoded immediately by a quick
inspection of the temporal coordinates of the individual texts. To carry out an analogous
quantification of genre trends and proportions between historical and contemporary novels, the
authors performed close readings of both (certain) paratexts and (particular) parts of each novel.
More specifically, the annotation was carried out on the following premises:
1. Many historical novels “reveal” themselves already in the title (as is the case
with Dronning Caroline Mathilde af Danmark = Queen Mathilde of Denmark by the pseudonym
Caja from 1889) or in the subtitle (as is the case with the anonymously published Caroline.
Bøhmens frygtelige Svøbe eller et Gammel Bjergslots Hemmelighed: Historisk-romantisk
Fortaelling = Caroline. Böhmen’s terrible scourge or the secret of an old mountain
castle; Historical-romantic tale), or more redundantly both the one and the other (cf. H.F.
Ewald’s Griffenfeld. Historisk Roman = Griffenfeld. Historical Novel from 1888).
2. If the titles do not contain clear paratextual signals, genre affiliation is often indicated
on the first few pages of the novel (as, for example, in the case of Indianerpigen fra Cape
Breton = The Indian Girl from Cape Breton by “L.M”, which, although the title page does
not indicate that we are dealing with a historical novel, immediately sets the temporal
scene with the opening sentence: “It was in the year 1780 [...]”).</p>
      <p>
        So, generally, it is striking to what extent the historical novels of the 19th century clearly
and actively give away their generic affiliation. As several literary scholars have pointed out,
this is probably because the historical novel’s foremost characteristic, and selling point, is
its historical setting, which producers and distributors clearly want to mark for the
intended readership and therefore “come clean” about already on the title page or the first pages [47, <xref ref-type="bibr" rid="ref43">42</xref>]. It can be added that these guidelines only to a very limited extent can be reversed on
the basis of a similar “scanning” of the paratextual and textual evidence. Non-historical, i.e.
contemporary novels—the novels which aesthetics and the criticism in the late 1800s placed a
decisive and favourable emphasis on—do not communicate their temporality in a similar way.
The MeMo corpus contains a few handfuls of instances of emphatically contemporary subtitles
(e.g. “Nutidsfortaelling” = story from the present day, “Samtidsroman” = contemporary novel
etc.), but in general the contemporary novels imply their genre through silence on
their temporal setting.
      </p>
      <table-wrap>
        <label>Table 1</label>
        <caption><p>Statistical information about the MeMo corpus.</p></caption>
        <table>
          <tbody>
            <tr><td>Total novels</td><td>859</td></tr>
            <tr><td>Total sentences</td><td>3,282,643</td></tr>
            <tr><td>Total words</td><td>53,588,381</td></tr>
            <tr><td>Average sentences per novel</td><td>3,821</td></tr>
            <tr><td>Average words per novel</td><td>62,385</td></tr>
            <tr><td>Average words per sentence</td><td>16.3</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>This transparent literary communication, or consumer information, which is of course less
obvious in the few and often canonized instances of experimental novels that “play” with genre
fiction such as the historical novel, can be said to be a general feature of popular literature,
including also romances and detective stories etc., and the genre-fiction system, developing
in the latter part of the 19th century [<xref ref-type="bibr" rid="ref16 ref28">28, 16</xref>]. The question is whether machines can learn to
read these literary and cultural signals, apparent in the paratext and/or the opening pages of
the novel, which for historical actors have seemed quite obvious? (Non-)Historical novel – yes
or no? In other words, the genre distinctions that our method relies on are historically framed,
meaning they are tied to specific periods and cultural contexts rather than having universal
relevance across time and place.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>To address this question, we treat the problem from a machine learning perspective. We
introduce an annotated corpus and fine-tune pre-trained transformer language models on it,
evaluating their performance on a held-out test set.</p>
        <p>
          We rely on the MeMo corpus [5], comprising 859 Danish and Norwegian novels spanning the
last 30 years of the 19th century, with more than 64 million tokens. The corpus is a rich and
diverse collection of texts that provides valuable insights into the classification of novels as
historical and contemporary during the period under investigation. Table 1 shows statistical
information about the corpus. We obtain the annotated dataset of novels from Bjerring-Hansen
and Ørtoft [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The final list of annotated novels consists of 859 novels, with 78% categorized as
contemporary and 22% as historical. Figure 1 illustrates the temporal distribution of historical
and contemporary novels in our corpus.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Novel Classification</title>
        <p>We use the dataset for training and evaluating transformer-based language models. Specifically,
inspired by the observations made by Bjerring-Hansen and Ørtoft, we consider three settings:
1. Providing the title and sub-title of the novel as input to the model,
2. Providing the first 15 sentences of the novel as input to the model,
3. Concatenating the title, sub-title and first 15 sentences and providing them to the model.
In all cases, we train the model to classify the novel according to the binary label obtained from
the annotated dataset. Subsequently, we evaluate the models on a test set of held-out novels
to assess how well the cues learned during training generalize to unseen novels.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experiments and Results</title>
      <p>We experiment with four pre-trained language models and three types of provided context,
comparing their performance and using them as the basis for an elaborate analysis of errors
and indicative features.</p>
      <sec id="sec-5-1">
        <title>5.1. Pre-trained Language Models</title>
        <p>
          The models evaluated in our novel classification experiments had been pre-trained on text
corpora including Danish and Norwegian text. We train them on the task using supervised
fine-tuning. Importantly, all models are selected based on their performance evaluated on Danish
and Norwegian literary benchmark datasets [22], the Scandinavian Embedding Benchmark,<sup>3</sup>
and ScandEval<sup>4</sup> [<xref ref-type="bibr" rid="ref31">31</xref>], even though these models had not been trained primarily on historical
Danish or Norwegian. We additionally experiment with a model (MeMo-BERT-03) specifically
adapted for the MeMo corpus.
        </p>
        <p>
          DanskBERT. DanskBERT,<sup>5</sup> a top-performing Danish language model noted for its success
on the ScandEval benchmark [41], is based on the XLM-RoBERTa architecture and trained
on the Danish Gigaword Corpus [4<xref ref-type="bibr" rid="ref3">3</xref>]. It features 24 layers, a hidden dimension of 1024, 16
attention heads, and a subword vocabulary of 250,000. The model was trained with a batch
size of 2,000 for 500,000 steps on 16 V100 GPUs over two weeks.
<fn><label>3</label><p>https://kennethenevoldsen.github.io/scandinavian-embedding-benchmark/</p></fn>
<fn><label>4</label><p>https://scandeval.com/</p></fn>
<fn><label>5</label><p>https://huggingface.co/vesteinn/DanskBERT</p></fn>
        </p>
        <sec id="sec-5-1-1">
          <title>Danish Foundation Models sentence encoder</title>
          <p>A sentence-transformers model [11]
based on the BERT architecture, featuring 24 layers, 16 attention heads, and a hidden size
of 1024. It incorporates a dropout rate of 0.1 for attention probabilities and hidden states, using
GELU activation and supporting up to 512 position embeddings. With a vocabulary size of
50,000 tokens, this model, referred to as DFM (Large), excels in tasks such as Danish sentiment
analysis and named entity recognition.<sup>6</sup></p>
          <p>
            MeMo-BERT-03. Developed by continuing the pre-training of the pre-trained Transformer
language model DanskBERT [2<xref ref-type="bibr" rid="ref2">2</xref>].<sup>7</sup> This foundation allows MeMo-BERT-03 to leverage
extensive linguistic knowledge for NLP tasks in historical literary Danish including sentiment
analysis and word sense disambiguation. The model outperformed other models in sentiment
analysis and word sense disambiguation tasks [2<xref ref-type="bibr" rid="ref2">2</xref>].
          </p>
          <p>NB-BERT-base. A general-purpose BERT-base model was developed using the extensive
digital collection at the National Library of Norway [20].<sup>8</sup> It follows the architecture of the
BERT Cased multilingual model and has been trained on a diverse range of Norwegian texts,
encompassing both Bokmål and Nynorsk from the past 200 years. This comprehensive
training allows the NB-BERT-base to effectively handle a wide array of NLP tasks in Norwegian.
The model achieved the second-highest performance ranking in the Norwegian Named Entity
Recognition task compared to other models listed on the ScandEval benchmark for Norwegian
natural language understanding.</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Experimental Setup</title>
        <p>
          Our experiments involve fine-tuning the pre-trained language models on the annotated novels
from our corpus. To enable testing of generalization in the face of temporal shift [26], the last
130 novels according to publication year (≈15%) are used as a testing set, while the remaining
novels are randomly divided into training (70%) and validation (15%) sets. The
experiments involve fine-tuning the models on the dataset using a batch size of 32, training
for 20 epochs with the AdamW optimizer [2<xref ref-type="bibr" rid="ref4">4</xref>] at a learning rate of 10<sup>−3</sup>. During training, we
monitor the performance on the validation set to assess model convergence and to prevent
overfitting, keeping the checkpoint with the best validation score. For evaluation, we employ
the F1-score metric due to its ability to balance precision and recall, particularly effective for
tasks with imbalanced datasets. The performance of each model is evaluated on both
validation and test sets, ensuring the robustness and generalizability of the models across different
datasets and epochs. For comparison, due to the imbalanced nature of the dataset, with 22% of
novels being historical overall and the percentage being 17% in the test set, a naive baseline that
selects a label based on the training distribution would achieve about 70% weighted F1-score.
<fn><label>6</label><p>https://huggingface.co/KennethEnevoldsen/dfm-sentence-encoder-large-exp2-no-lang-align</p></fn>
<fn><label>7</label><p>https://huggingface.co/MiMe-MeMo/MeMo-BERT-03</p></fn>
<fn><label>8</label><p>https://huggingface.co/NbAiLab/nb-bert-base</p></fn>
        </p>
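        <p>The split and the naive baseline described above can be sketched as follows. This is a minimal illustration, not the authors' code: the random seed, the tie-breaking among novels from the same year, and the reading of the baseline as a random guesser that follows the training label distribution are all assumptions. The 23% share of historical novels in the training data is likewise an assumption, derived from the rounded percentages quoted in the text.</p>
        <preformat>
```python
import random


def temporal_split(novels, n_test=130, val_fraction=0.15, seed=0):
    """Hold out the most recent novels for testing; split the rest at random.

    novels: list of (novel_id, publication_year) pairs.
    The seed and the order of novels within one year are assumptions.
    """
    ordered = sorted(novels, key=lambda pair: pair[1])
    test = ordered[-n_test:]
    rest = ordered[:-n_test]
    random.Random(seed).shuffle(rest)
    n_val = round(val_fraction * len(novels))
    return rest[n_val:], rest[:n_val], test


def expected_weighted_f1(p_hist_test, p_hist_pred):
    """Approximate weighted F1 of a baseline that guesses 'historical' at random
    with probability p_hist_pred, based on the expected confusion matrix."""
    total = 0.0
    classes = ((p_hist_test, p_hist_pred), (1 - p_hist_test, 1 - p_hist_pred))
    for share, rate in classes:
        # The guess is independent of the true label, so precision equals the
        # class share in the test set and recall equals the guessing rate.
        precision, recall = share, rate
        f1 = 2 * precision * recall / (precision + recall)
        total += share * f1
    return total


# Toy corpus: 859 novels with made-up publication years between 1870 and 1899.
toy_novels = [(i, 1870 + i % 30) for i in range(859)]
train, val, test = temporal_split(toy_novels)
print(len(train), len(val), len(test))

# 17% historical in the test set, roughly 23% (assumed) in the training data.
print(round(expected_weighted_f1(0.17, 0.23), 2))
```
        </preformat>
        <p>Under these assumptions the expected weighted F1-score comes out at roughly 0.70, in line with the baseline figure quoted above.</p>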
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Novels Classification Experiments</title>
        <sec id="sec-5-3-1">
          <title>5.3.1. Titles and Sub-titles Classification</title>
          <p>In this experiment, we concatenate the title and subtitle of each novel and perform classification
by fine-tuning the aforementioned pre-trained language models with the novel labels, using the
cross-entropy objective. Table 2 (left) presents the fine-tuning results of the selected models.
DFM (Large) achieved the highest performance on the validation set with an F1-score of 92%,
while the MeMo-BERT-03 model excelled on the testing set with an F1-score of 91%.</p>
        </sec>
        <sec id="sec-5-3-2">
          <title>5.3.2. First 15 Sentences Classification</title>
          <p>
            We use the Danish pipeline in spaCy [1<xref ref-type="bibr" rid="ref8">8</xref>] for sentence segmentation and extract the first 15
sentences from each novel. We then use each sentence as a separate input instance for
fine-tuning the aforementioned pre-trained language models, with each sentence inheriting the
label of its novel. To predict novel-level labels using the
fine-tuned models, we apply them to the first 15 sentences of a (validation or testing) novel, and use
majority voting to determine the novel-level predictions.
          </p>
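          <p>The majority-voting step can be sketched as below. This is an illustrative sketch rather than the authors' implementation; in particular, the text does not say how ties are resolved, so the <monospace>tie_label</monospace> fallback is an assumption (ties cannot arise with 15 binary votes, but could with an even number of inputs).</p>
          <preformat>
```python
from collections import Counter


def majority_vote(sentence_labels, tie_label="contemporary"):
    """Aggregate sentence-level predictions into one novel-level label.

    tie_label is a hypothetical fallback used when the two most frequent
    labels receive the same number of votes.
    """
    if not sentence_labels:
        raise ValueError("need at least one sentence-level prediction")
    ranked = Counter(sentence_labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


# Nine of fifteen sentences are predicted as contemporary.
votes = ["historical"] * 6 + ["contemporary"] * 9
print(majority_vote(votes))
```
          </preformat>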
          <p>The results of fine-tuning the models are shown in Table 2 (middle). DFM (Large) and
MeMo-BERT-03 achieved the highest performance on the validation set with an F1-score of 81%, while
DFM (Large) excelled on the testing set with an F1-score of 86%. Notably, for all models, using
the first 15 sentences as input performs worse than using the title and sub-title.</p>
        </sec>
        <sec id="sec-5-3-3">
          <title>5.3.3. Both Titles &amp; Sub-titles and First 15 Sentences Classification</title>
          <p>In this experiment, we combine both the titles &amp; sub-titles and the first 15 sentences of each
novel in the corpus: technically, we repeat the same setup as using the first 15 sentences, but
additionally prepend the concatenated title and sub-title as if they were another sentence. The
fine-tuning results of the four models in this experiment are shown in Table 2 (right). DFM
(Large), MeMo-BERT-03 and NB-BERT-Base achieve equal performance on the validation set
with an F1-score of 82%, while DanskBERT and NB-BERT-Base perform best on the testing set
with an F1-score of 84%. Overall, performance in this setting is similar to just using the first
15 sentences, but the best performance on the test set is in fact obtained when just using titles
and sub-titles, and ignoring the first 15 sentences.</p>
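          <p>The three input settings compared in this section can be sketched as a single helper that returns the text instances fed to a classifier for one novel. The function name, the setting names and the use of a single space to join title and sub-title are illustrative assumptions, and the opening sentences in the example are placeholders.</p>
          <preformat>
```python
def build_instances(title, subtitle, sentences, setting, n_sentences=15):
    """Return the list of text inputs for one novel under a given setting.

    setting: "title", "opening" or "both" (hypothetical names).
    """
    # Joining title and sub-title with a space is an assumption.
    paratext = " ".join(part for part in (title, subtitle) if part)
    opening = list(sentences[:n_sentences])
    if setting == "title":
        return [paratext]
    if setting == "opening":
        return opening
    if setting == "both":
        # The paratext is prepended as if it were one more sentence.
        return [paratext] + opening
    raise ValueError("unknown setting: " + setting)


placeholder_sentences = ["Sentence %d." % i for i in range(1, 21)]
for name in ("title", "opening", "both"):
    instances = build_instances(
        "Ridder Thorvald", "En lille københavnsk Roman", placeholder_sentences, name
    )
    print(name, len(instances))
```
          </preformat>
          <p>In the last two settings every instance would receive the label of its novel during fine-tuning, and the instance-level predictions would then be combined by majority voting.</p>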
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion</title>
      <p>While all models are highly accurate after fine-tuning, surprisingly, we observe that the best
predictions are obtained by just using the title and sub-title as input, disregarding the first 15
sentences of the novel. This suggests either that the genre information is less salient in the first
15 sentences, or that the models are not as capable of extracting it from them. To analyze this
further, we investigate model confidence on mislabeled predictions, and perform a fine-grained
error analysis.</p>
      <sec id="sec-6-1">
        <title>6.1. Model Calibration</title>
        <p>
          When reading the text opening, introspection from expert annotation reveals that genre
identification often hinges on specific key sentences (e.g., mentioning specific entities, events, or
years) rather than the entire opening passage. While most sentences in the opening text do
not clearly suggest one genre or another, these “giveaways” are sparse but salient. This nuance
may be lost by the majority voting procedure over sentences, leading to misclassifications when
the first 15 sentences are used as input. Therefore, we are interested in the model’s confidence
and whether it is calibrated to match the experts’ uncertainty or disagreement about the labels
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
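        <p>To evaluate model calibration, we use Expected Calibration Error (ECE), a metric that
measures how well the model’s predicted probabilities reflect the true accuracy [9]:
<disp-formula><tex-math>\mathrm{ECE} = \sum_{m=1}^{M} \frac{|B_m|}{n} \left| \mathrm{acc}(B_m) - \mathrm{conf}(B_m) \right|</tex-math></disp-formula>
where <inline-formula><tex-math>M = 10</tex-math></inline-formula> is the number of bins (confidence intervals), <inline-formula><tex-math>|B_m|</tex-math></inline-formula> is the number of samples in
bin <inline-formula><tex-math>m</tex-math></inline-formula>, <inline-formula><tex-math>n</tex-math></inline-formula> is the total number of samples, <inline-formula><tex-math>\mathrm{acc}(B_m)</tex-math></inline-formula> is the accuracy in bin <inline-formula><tex-math>m</tex-math></inline-formula>, and <inline-formula><tex-math>\mathrm{conf}(B_m)</tex-math></inline-formula> is the average confidence in bin <inline-formula><tex-math>m</tex-math></inline-formula>.</p>
        <p>When using titles and sub-titles as input, the best-performing model (MeMo-BERT-03)
achieved a relatively low ECE of 0.062, as shown in Table 3, indicating that it was reasonably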
well-calibrated when relying on paratextual information. Titles and sub-titles often contain
clear genre markers that allow the model to make high-confidence, mostly accurate
predictions. However, the model still made misclassifications, particularly when historical-sounding
titles misled the model. Despite this, the model’s overall confidence generally matched its
performance in this setting, and it exhibited the lowest calibration error compared to other
settings. This result underscores the strength of using paratextual clues, though it also reveals
that misleading terms in the title can cause overconfidence in wrong predictions.</p>
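        <p>For readers who want to reproduce the metric, the Expected Calibration Error can be computed along the following lines. This is a minimal sketch with equal-width confidence bins; the exact binning convention (which bin a boundary value falls into) is an assumption, and the confidences in the example are made up.</p>
        <preformat>
```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins.

    confidences: predicted probability of the predicted class, each in [0, 1].
    correct: booleans, True where the prediction matched the gold label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes its calibration gap, weighted by its share of samples.
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


# Made-up predictions: three correct, one confident mistake.
example_conf = [0.95, 0.92, 0.85, 0.65]
example_correct = [True, True, False, True]
print(expected_calibration_error(example_conf, example_correct))
```
        </preformat>
        <p>A single confident mistake dominates the score in this toy example, which is the kind of overconfidence the metric is meant to expose.</p>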
        <p>In contrast, when the models used the first 15 sentences of the novels as input, calibration
worsened across all models. For example, the ECE for DFM (Large) increased to 0.085, and
other models similarly struggled with higher calibration errors (see Table 3). This is likely
because, as noted in expert analyses, genre signals are not uniformly distributed across the
opening sentences. Instead, they tend to appear in specific key sentences that reveal important
genre-relevant details. In many cases, the first 15 sentences are ambiguous, lacking explicit
time markers or character descriptions, which increases the model’s uncertainty. However,
rather than reflecting this uncertainty in their confidence scores, the models often exhibited
overconfidence, resulting in higher calibration errors. This overconfidence indicates that the
models are not adequately capturing the uncertainty present in the text openings, a discrepancy
that reflects the challenge of extracting nuanced genre information from longer inputs.</p>
        <p>Combining both titles, sub-titles, and the first 15 sentences of the novels did not uniformly
improve calibration. For MeMo-BERT-03, the ECE increased to 0.094, suggesting that
integrating both sources of information did not lead to better confidence alignment. Although the
models had access to more context, they struggled to effectively weigh the paratext against
the more ambiguous textual cues from the opening sentences. In some cases, conflicting
signals between the title and the text may have caused the models to oscillate between genres,
ultimately leading to poorer calibration. DanskBERT, however, exhibited slightly better
calibration in this setting, indicating that it was more adept at integrating the two types of input
compared to other models. This slight improvement over single-input settings suggests that
certain models can benefit from additional context, though the integration process remains
challenging for most.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Error Analysis</title>
        <p>We discuss the errors encountered during the classification of historical and contemporary
novels, focusing on prediction errors made by the best-performing models, as well as annotation
errors identified by expert inspection of the mislabeled predictions.</p>
        <sec id="sec-6-2-1">
          <title>6.2.1. Prediction Errors</title>
          <p>A notable pattern in the prediction errors is the tendency of the models to misclassify
contemporary novels as historical based on titles (in 12 out of 17 cases where at least one model
misclassified the label in this setting) and based on text openings (first 15 sentences) or the
combination of titles and text openings (11 out of 11 cases). An illustrative example of the first
type of error is Albert Gnudtzman’s urban novel Ridder Thorvald. En lille københavnsk Roman
(Knight Thorvald. A small Copenhagen novel, 1899). The title’s keyword “knight” leads the
models to misinterpret it as a historical romance. However, the opening scene set in a lively
urban café clearly establishes it as a contemporary novel, highlighting the discrepancy between
title-based and content-based classification.</p>
          <p>When models misinterpret text openings (first 15 sentences), a common issue is their
failure to recognize historical settings established through character introductions rather than
explicit time clues. For instance, in Marie Henckel’s Lolotte. En Roman fra den Gustavianske
Tid (Lolotte. A Novel from the Gustavian Period, 1898), while the subtitle clearly indicates a
late 18th-century setting, the models fail to date characters such as Prince Gustaf and Sofie
Magdalene, leading to incorrect classifications.</p>
          <p>Machine readings of genre clues also shed light on borderline cases, such as novels set in
the near past relative to the modern breakthrough period (1870-99). Examples include novels
set during the Danish-German wars of 1848-50 and 1864, like Chr. Christensen’s Kaerlighedens
Mysterier. En Historie fra 1848-50 (The Mysteries of Love. A Tale from 1848-50, 1899) and P.A.
Worm’s Forbrydelsernes Konge eller Den skalperede Praest (The King of Crime or the Scalped
Priest, 1899). An intriguing case involves a novel beginning in the narrator’s present with
modern elements like electric light and a telephone but transitioning to a historical analepsis:
U. Ravn’s Interioerer fra vores Bedsteforaeldres Tid (Interiors from the Time of our Grandparents,
1899).</p>
        </sec>
        <sec id="sec-6-2-2">
          <title>6.2.2. Annotation Errors</title>
          <p>After in-depth expert analysis, eight of the models’ “errors” in the test set turned out to be
Bjerring-Hansen and Ørtoft’s annotation errors, including four plain mistakes and four tricky
in-between novels where further inspection validated the models’ predictions. These
erroneous annotations were evenly distributed between misclassified historical and contemporary
novels.</p>
          <p>An interesting case is the novel Hvorfor hun blev Nonne. En Fortaelling om fransk
Klosterliv (Why she became a nun. A story about French monastic life, 1899) by the pseudonym
“Herdis”. Both models and annotators were misled by the title into thinking it was a medieval
story. However, a close reading of the text’s opening pages revealed it to be set in modern
times, aligning with the neo-romantic current of the 1890s that revived the historical novel
and Catholic themes.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this study, we presented a dataset of Danish and Norwegian novels from the late 19th century,
classified as historical or contemporary by literary scholars. We investigated the performance
of several pre-trained language models in distinguishing between these two genres based on
titles and the first few sentences. While the models demonstrated commendable accuracy, the
error analysis revealed limitations in capturing the nuanced cues recognized by human
experts. These findings underscore the complexity of literary text classification and suggest that
while NLP models can significantly aid in the categorization process, they still require further
refinement to match human interpretative abilities fully. In our approach, we chose to limit
the textual input to only the titles and the first 15 sentences of each novel. This decision was
informed by the pretraining of the models, which predominantly focused on non-literary and
contemporary content, meaning that historical figures, settings, and subtleties were likely
underrepresented in the training data. As a result, we hypothesized that the opening framing of
the novels, which often serves to establish the genre, would be more effective for detection
than deeper content. This was evident in our findings, where the titles and subtitles emerged
as the most straightforward indicators of the genre distinction. While expanding the analysis
to include more content from the novels could potentially capture stronger genre signals, the
results indicated that the models performed best on the titles and subtitles; in fact, the first
15 sentences did not surpass the effectiveness of the titles alone. This suggests that the model,
like the historical readers, effectively identifies key genre signals from surface-level clues, such
as titles and subtitles. While our method is aligned with historical signals by focusing on
initial clues that reflect the genre distinctions recognized during the period, it also highlights
the need for deeper content analysis, which would require models pretrained on a substantial
amount of historical material extending well before the end of the 19th century. To address
this, future work could explore more sophisticated model architectures or hybrid approaches
that combine machine learning with expert knowledge to enhance the accuracy and depth of
genre classification in literary studies.</p>
      <p>Our study shows that the historical novel was by no means an extinct genre in
Danish-Norwegian literature at the end of the 19th century, as the prevailing modern aesthetic would
have it, and furthermore that this was a rather obvious fact, since genre decoding (historical
novel: yes or no?) is a relatively trivial affair. It can be determined with great certainty,
both by people and models, by reading the paratext and/or the first lines of the novels. Of
course, this generic stability cannot be taken for granted or universalized if the fine-tuned
models from our study are applied to other textual sources, such as 20th century novels, where
genre innovations are increasing and where the historical novel is also exposed to modernist
experiments (an early Danish example of this is Nobel Prize winner Johannes V. Jensen’s novel
Kongens Fald (The fall of the king, 1900-01), which represents both a historical depiction of the
dramatic events leading to the fall of the Kalmar Union and a modern flâneur novel). In future
literary studies, we will be able to test the stability of the genre distinctions created during
the 19th century, when the literary field and the formation of taste were established, but for
accuracy in prediction, we will most likely have to adjust our methods to consider the content
of novels on a broader scale.</p>
      <p>The comparison between qualitative annotations and machine predictions enhanced our
understanding of the quantitative arguments applicable to the period’s literature. It highlighted
the coexistence and interaction of old and new forms and meanings within an aesthetic
timespan. This approach aligns with a broader perspective on genre classification that leverages
predictive modeling. As Ted Underwood suggests, our objective shifts from defining a genre to
developing a model that can replicate the judgments made by specific historical observers [44].
This paradigm not only advances our technical capabilities but also deepens our literary and
historical understanding, bridging the gap between computational methods and humanistic
inquiry.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] A. Allaith, K. Degn, A. Conroy, B. Pedersen, J. Bjerring-Hansen, and D. Hershcovich. “
          <article-title>Sentiment Classification of Historical Danish and Norwegian Literary Texts”</article-title>
          .
          <source>In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)</source>
          . Ed. by
          <string-name>
            <given-names>T.</given-names>
            <surname>Alumäe</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Fishel</surname>
          </string-name>
          . Tórshavn, Faroe Islands: University of Tartu Library,
          <year>2023</year>
          , pp.
          <fpage>324</fpage>
          -
          <lpage>334</lpage>
          . url: https://aclanthology.org/2023.nodalida-1.34.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anderson</surname>
          </string-name>
          . “
          <article-title>From progress to catastrophe”</article-title>
          .
          <source>In: London Review of Books 33.15</source>
          (
          <year>2011</year>
          ), pp.
          <fpage>24</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Baan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Aziz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Plank</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          . “
          <article-title>Stop Measuring Calibration When Humans Disagree”</article-title>
          .
          <source>In: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>
          . Ed. by
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kozareva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          . Abu Dhabi,
          <source>United Arab Emirates: Association for Computational Linguistics</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1892</fpage>
          -
          <lpage>1915</lpage>
          . doi: 10.18653/v1/2022.emnlp-main.124. url: https://aclanthology.org/2022.emnlp-main.124.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bamman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Popat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen</surname>
          </string-name>
          . “
          <article-title>An annotated dataset of literary entities”</article-title>
          .
          <source>In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers).
          <year>2019</year>
          , pp.
          <fpage>2138</fpage>
          -
          <lpage>2144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bjerring-Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Kristensen-McLachlan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Diderichsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Hansen</surname>
          </string-name>
          . “
          <article-title>Mending Fractured Texts. A heuristic procedure for correcting OCR data”</article-title>
          . In: (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bjerring-Hansen</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. Ø.</given-names>
            <surname>Rasmussen</surname>
          </string-name>
          . “
          <article-title>Litteratursociologi og kvantitative litteraturstudier: Den historiske roman i det moderne gennembrud som case”</article-title>
          .
          <source>In: Passage - Tidsskrift for litteratur og kritik 38.89</source>
          (
          <year>2023</year>
          ), pp.
          <fpage>171</fpage>
          -
          <lpage>189</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Brandes</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          . “
          <article-title>The 1872 Introduction to Hovedstrømninger i det 19de Aarhundredes Litteratur (Main Currents of Nineteenth-Century Literature)”</article-title>
          .
          <source>In: PMLA/Publications of the Modern Language Association of America 132.3</source>
          (
          <issue>2017</issue>
          ), pp.
          <fpage>696</fpage>
          -
          <lpage>705</lpage>
          . doi:
          <volume>10</volume>
          .1632/pmla.
          <year>2017</year>
          .
          <volume>132</volume>
          .3.696.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Calvo Tello</surname>
          </string-name>
          .
          <article-title>The novel in the Spanish Silver Age: a digital analysis of genre using machine learning</article-title>
          . Bielefeld University Press,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Desai</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Durrett</surname>
          </string-name>
          . “
          <article-title>Calibration of Pre-trained Transformers”</article-title>
          .
          <source>In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing</source>
          (EMNLP). Ed. by
          <string-name>
            <given-names>B.</given-names>
            <surname>Webber</surname>
          </string-name>
          , T. Cohn,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          . Online: Association for Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>295</fpage>
          -
          <lpage>302</lpage>
          . doi: 10.18653/v1/2020.emnlp-main.21. url: https://aclanthology.org/2020.emnlp-main.21.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          . “
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”</article-title>
          .
          <source>In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers). Ed. by
          <string-name>
            <given-names>J.</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doran</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Solorio</surname>
          </string-name>
          . Minneapolis, Minnesota: Association for Computational Linguistics,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . doi: 10.18653/v1/N19-1423. url: https://aclanthology.org/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K.</given-names>
            <surname>Enevoldsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A. F.</given-names>
            <surname>Egebaek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Holm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bernstorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Larsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. B.</given-names>
            <surname>Jørgensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Højmark-Bertelsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. B.</given-names>
            <surname>Vahlstrup</surname>
          </string-name>
          , P. Møldrup-Dalum, and K. Nielbo.
          <article-title>Danish Foundation Models</article-title>
          .
          <year>2023</year>
          . arXiv: 2311.07264 [cs.CL].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Erdmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Joseph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Janse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ajaka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Elsner</surname>
          </string-name>
          , and M.-C. de Marneffe. “
          <article-title>Challenges and solutions for Latin named entity recognition”</article-title>
          .
          <source>In: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)</source>
          .
          <year>2016</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>93</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fleishman</surname>
          </string-name>
          .
          <article-title>The English historical novel</article-title>
          . Johns Hopkins University Press,
          <year>1971</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Frow</surname>
          </string-name>
          . Genre. Routledge,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Furuland</surname>
          </string-name>
          . “
          <article-title>Romanen som vardagsvara: förläggare, författare och skönlitterära häftesserier i Sverige 1833-1851 från Lars Johan Hierta till Albert Bonnier”</article-title>
          .
          <source>PhD thesis</source>
          .
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Goldstone</surname>
          </string-name>
          . “
          <article-title>Origins of the US genre-fiction system, 1890-1956”</article-title>
          .
          <source>In: Book history 26.1</source>
          (
          <issue>2023</issue>
          ), pp.
          <fpage>203</fpage>
          -
          <lpage>233</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Halliwell</surname>
          </string-name>
          .
          <article-title>Aristotle's poetics</article-title>
          . University of Chicago Press,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Honnibal</surname>
          </string-name>
          , I. Montani,
          <string-name>
            <given-names>S.</given-names>
            <surname>Van Landeghem</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Boyd</surname>
          </string-name>
          . “
          <article-title>spaCy: Industrial-strength Natural Language Processing in Python”</article-title>
          . In: (
          <year>2020</year>
          ). doi: 10.5281/zenodo.1212303.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kogkitsidou</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Gambette</surname>
          </string-name>
          . “
          <article-title>Normalisation of 16th and 17th century texts in French and geographical named entity recognition”</article-title>
          .
          <source>In: Proceedings of the 4th ACM SIGSPATIAL Workshop on Geospatial Humanities</source>
          .
          <year>2020</year>
          , pp.
          <fpage>28</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Kummervold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De la Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wetjen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Brygfjeld</surname>
          </string-name>
          . “
          <article-title>Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model”</article-title>
          .
          <source>In: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)</source>
          . Reykjavik, Iceland (Online): Linköping University Electronic Press, Sweden,
          <year>2021</year>
          , pp.
          <fpage>20</fpage>
          -
          <lpage>29</lpage>
          . url: https://aclanthology.org/2021.nodalida-main.3.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kurbanova</surname>
          </string-name>
          . “
          <article-title>Genre Classification and the Current State of Turkmen Musical Folklore”</article-title>
          .
          <source>In: Culture and Arts in the Modern World</source>
          <volume>24</volume>
          (
          <year>2023</year>
          ), pp.
          <fpage>155</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>A.</given-names>
            <surname>Al-Laith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Conroy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bjerring-Hansen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Hershcovich</surname>
          </string-name>
          . “
          <article-title>Development and Evaluation of Pre-trained Language Models for Historical Danish and Norwegian Literary Texts”</article-title>
          .
          <source>In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)</source>
          . Ed. by
          <string-name>
            <given-names>N.</given-names>
            <surname>Calzolari</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hoste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sakti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Xue</surname>
          </string-name>
          . Torino, Italia: ELRA and
          <string-name>
            <surname>ICCL</surname>
          </string-name>
          ,
          <year>2024</year>
          , pp.
          <fpage>4811</fpage>
          -
          <lpage>4819</lpage>
          . url: https://aclanthology.org/2024.lrec-main.431.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          . “
          <article-title>DeepGenre: Deep Neural Networks for Genre Classification in Literary Works”</article-title>
          .
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>I.</given-names>
            <surname>Loshchilov</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Hutter</surname>
          </string-name>
          . “
          <article-title>Decoupled Weight Decay Regularization”</article-title>
          .
          <source>In: International Conference on Learning Representations</source>
          .
          <year>2017</year>
          . url: https://api.semanticscholar.org/CorpusID:53592270.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lukács</surname>
          </string-name>
          . “
          <article-title>Der Historische Roman. 1937”</article-title>
          . In: Berlin: Aufbau-Verlag (
          <year>1955</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lukes</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Søgaard</surname>
          </string-name>
          . “
          <article-title>Sentiment analysis under temporal shift”</article-title>
          .
          <source>In: Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</source>
          . Ed. by
          <string-name>
            <given-names>A.</given-names>
            <surname>Balahur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hoste</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Klinger</surname>
          </string-name>
          . Brussels, Belgium: Association for Computational Linguistics,
          <year>2018</year>
          , pp.
          <fpage>65</fpage>
          -
          <lpage>71</lpage>
          .
          doi: 10.18653/v1/W18-6210. url: https://aclanthology.org/W18-6210.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>B.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sulem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P. B.</given-names>
            <surname>Veyseh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Sainz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Agirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Heintz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          . “
          <article-title>Recent advances in natural language processing via large pre-trained language models: A survey”</article-title>
          .
          <source>In: ACM Computing Surveys 56.2</source>
          (
          <year>2023</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>F.</given-names>
            <surname>Moretti</surname>
          </string-name>
          . “
          <article-title>Style, Inc. Reflections on Seven Thousand Titles (British Novels, 1740–1850)”</article-title>
          .
          <source>In: Critical Inquiry 36.1</source>
          (
          <year>2009</year>
          ), pp.
          <fpage>134</fpage>
          -
          <lpage>158</lpage>
          . doi: 10.1086/606125.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mucignat</surname>
          </string-name>
          . “
          <article-title>Fredric Jameson. The Antinomies of Realism. London: Verso, 2013, 326 pp.”</article-title>
          <source>In: Orbis Litterarum 71.5</source>
          (
          <year>2016</year>
          ), pp.
          <fpage>430</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>E.</given-names>
            <surname>Munch-Petersen</surname>
          </string-name>
          .
          <article-title>“Romanens århundrede: studier i den masselæste oversatte roman i Danmark 1800-1870”</article-title>
          . In: (No Title) (
          <year>1978</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          . “
          <article-title>ScandEval: A benchmark for Scandinavian natural language processing”</article-title>
          .
          <source>In: arXiv preprint arXiv:2304.00906</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Nolazco-Flores</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Guerrero-Galván</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Del-Valle-Soto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. P.</given-names>
            <surname>Garcia-Perera</surname>
          </string-name>
          .
          <article-title>“Genre Classification of Books on Spanish”</article-title>
          .
          <source>In: IEEE Access</source>
          <volume>11</volume>
          (
          <year>2023</year>
          ), pp.
          <fpage>132878</fpage>
          -
          <lpage>132892</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [33] [34]
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Paige</surname>
          </string-name>
          .
          <article-title>Technologies of the Novel: Quantitative Data and the Evolution of Literary Systems</article-title>
          . Cambridge University Press,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <year>2019</year>
          , pp.
          <fpage>81</fpage>
          -
          <lpage>89</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          . “
          <article-title>Language Models are Unsupervised Multitask Learners”</article-title>
          . In: (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Radway</surname>
          </string-name>
          .
          <article-title>Reading the romance: Women, patriarchy, and popular literature</article-title>
          . Univ of North Carolina Press,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rakshit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharyya</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Haffari</surname>
          </string-name>
          . “
          <article-title>Automated analysis of Bangla poetry for classification and poet identification”</article-title>
          .
          <source>In: Proceedings of the 12th international conference on natural language processing</source>
          .
          <year>2015</year>
          , pp.
          <fpage>247</fpage>
          -
          <lpage>253</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schneider</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Fabo</surname>
          </string-name>
          . “
          <article-title>Stage Direction Classification in French Theater: Transfer Learning Experiments”</article-title>
          .
          <source>In: Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024)</source>
          .
          <year>2024</year>
          , pp.
          <fpage>278</fpage>
          -
          <lpage>286</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singhal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Underwood</surname>
          </string-name>
          . “
          <article-title>The rise and fall of genre differentiation in English-language fiction”</article-title>
          .
          <source>In: DH2020 (ADHO) Proceedings</source>
          <volume>1613</volume>
          (
          <year>2020</year>
          ), p.
          <fpage>0073</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>H. E.</given-names>
            <surname>Shaw</surname>
          </string-name>
          .
          <article-title>The forms of historical fiction: Sir Walter Scott and his successors</article-title>
          . Cornell University Press,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>V.</given-names>
            <surname>Snæbjarnarson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Simonsen</surname>
          </string-name>
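          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Glavaš</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Vulić</surname>
          </string-name>
          . “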
          <article-title>Transfer to a Low-Resource Language via Close Relatives: The Case Study on Faroese”</article-title>
          .
          <source>In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)</source>
          . Tórshavn, Faroe Islands: Linköping University Electronic Press, Sweden,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>L.</given-names>
            <surname>Søndergaard</surname>
          </string-name>
          . “
          <article-title>At fortælle historier om historien: Om den historiske roman i relation til Poul Vads Rubruk (1972) og Ib Michaels Troubadurens lærling (1983)”</article-title>
          .
          <source>In: Fortællingen i Norden efter 1960. Aalborg Universitetsforlag</source>
          ,
          <year>2004</year>
          , pp.
          <fpage>404</fpage>
          -
          <lpage>412</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>L.</given-names>
            <surname>Strømberg-Derczynski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ciosici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Baglini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Christiansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Dalsgaard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fusaroli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Henrichsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hvingelby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kirkedal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Kjeldsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ladefoged</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. Å.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Madsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Rystrøm</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Varab</surname>
          </string-name>
          . “
          <article-title>The Danish Gigaword Corpus”</article-title>
          .
          <source>In: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)</source>
          . Reykjavik, Iceland (Online): Linköping University Electronic Press, Sweden,
          <year>2021</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>421</lpage>
          . url: https://aclanthology.org/2021.nodalida-main.46
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>T.</given-names>
            <surname>Underwood</surname>
          </string-name>
          . “
          <article-title>The life cycles of genres”</article-title>
          . In: (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>T.</given-names>
            <surname>Underwood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bamman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          . “
          <article-title>The transformation of gender in English-language fiction”</article-title>
          . In: (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wilkens</surname>
          </string-name>
          . “
          <article-title>Genre, computation, and the varieties of twentieth-century US fiction”</article-title>
          .
          <source>In: Journal of Cultural Analytics 2.2</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>M.</given-names>
            <surname>Winge</surname>
          </string-name>
          .
          <article-title>Fortiden som spejl</article-title>
          .
          <source>Lindhardt og Ringhof</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>