<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Animacy in German Folktales</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julian Häußler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Janis von Keitz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evelyn Gius</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>fortext lab, Technical University of Darmstadt</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>1023</fpage>
      <lpage>1036</lpage>
      <abstract>
        <p>This paper explores the phenomenon of animacy in prose by the example of German folktales. We present a manually annotated corpus of 19 German folktales from the Brothers Grimm collection and train a classifier on these annotations. Building on previous work in animacy detection, we evaluate the classifier's performance and its application to a larger corpus. The findings highlight the complexity of animacy in literary texts, distinguishing it from named entity recognition and emphasizing the classifier's potential for enhancing character recognition in narratives.</p>
      </abstract>
      <kwd-group>
        <kwd>animacy</kwd>
        <kwd>animacy classification</kwd>
        <kwd>folktales</kwd>
        <kwd>Computational Literary Studies</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>[A]nd when any one attacked him he
would say, “Stick, out of the sack!” and
directly out jumped the stick, and dealt a
shower of blows on the coat or jerkin,
and the back beneath, which quickly
ended the affair.</p>
      <p>The goal of this paper is to scrutinize animacy in German folktales as a phenomenon of
interest for Computational Literary Studies (CLS) by showcasing the manual annotation of
animacy and presenting a classifier trained on these annotations. The overall approach builds
on the work of Karsdorp et al. [9], who developed an approach to animacy detection in Dutch
folktales. In the following, we discuss previous work on animacy (section 2) and present our corpus
of German folktales from the Children’s and Household Tales by the Brothers Grimm as well as
our understanding of animacy and its manual annotation (section 3). Furthermore, we evaluate
its relation to the neighboring concepts of fictional characters and named entity recognition
(section 4). We then reproduce Karsdorp et al.’s approach for German folktales, evaluate the
results of our classification and apply them to a larger corpus (section 5). We close by summarizing
our findings and sketching possible directions for future work (section 6).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Animacy in Text-based Research</title>
      <p>
        The concept of animacy is crucial in human perception for distinguishing between living and
non-living entities. Animacy perception, an ability that develops in early childhood
and might be innate [
        <xref ref-type="bibr" rid="ref10">9</xref>
        ], is based on the unpredictability of biological life but also on agency,
influenced by movements and mental states [20]. This perception extends to reading texts. In
fiction, the recognition of animacy is influenced by the narrative context, allowing fictional
entities to be perceived as animate even if they are not animate in real life [12].
      </p>
      <p>In text-based research, animacy was introduced as a grammatical category by Michael
Silverstein in the 1970s. He suggested a hierarchy for describing languages on a general level,
ranking grammatical phenomena according to animacy: 1st person &gt; 2nd person &gt; 3rd
person/deictics &gt; human NPs &gt; animate NPs &gt; inanimate NPs [4]. This hierarchy influences
grammatical structures in many languages, affecting aspects such as inflection and word order. For
example, in German as well as in English and other languages, animacy influences the choice of
interrogative pronouns and the use of certain verbs (e.g., “schauen” / “to look” requires animacy
of the semantic subject). Despite this systematic hierarchy, animacy is not a simple linear scale.
It is influenced by additional parameters, including the perception of empathy and sensation.
Objects like computers or organizations are sometimes considered animate due to attributed
intelligence or agency, complicating the distinction between animate and inanimate [21].</p>
      <p>
        In literary studies, a definitive concept of animacy has yet to be established. Still, it can be
explored through stylistic devices like personification and anthropomorphism, as well as through
the characterization of fictional characters. Personification assigns human attributes to
nonhuman entities, while anthropomorphism extends this by giving them human-like forms and
mental attributes. Additionally, narratological theories examine how characters, including
potentially inanimate ones, are constructed and perceived, highlighting the complexity of
character portrayal through textual indicators, language use, and readers’ cognitive engagement
[
        <xref ref-type="bibr" rid="ref9">8</xref>
        ].
      </p>
      <p>In NLP and adjacent fields, the binary classification of entities in texts into animate and
inanimate entities is particularly relevant. Animacy classification aids numerous NLP tasks such as
anaphora or coreference resolution, dependency parsing, word sense disambiguation, semantic
role labeling, as well as automatic text generation and translation [7, 9]. Determining whether
a pronoun refers to an animate or inanimate antecedent significantly simplifies anaphora and
coreference resolution in many languages. Since animacy also influences grammatical
structures in many languages, it also affects dependency parsing and semantic role labeling. In
automatic text generation, taking into account the animacy required by verbs is essential for
generating semantically correct sentences.</p>
      <p>
        In Computational Literary Studies, animacy classification can help identify characters in
narratives [
        <xref ref-type="bibr" rid="ref8">7, 16</xref>
        ]. However, fictional worlds in literature can challenge traditional animacy
classification, as objects or plants may act as agents, diverging from real-world knowledge.
Rule-based systems with semantic lexicons like WordNet might misclassify such entities. Therefore,
animacy classification in narrative texts should build on contextual understanding rather than
fixed rules [
        <xref ref-type="bibr" rid="ref10">9</xref>
        ]. Hybrid systems combining machine learning with rule-based methods show
promise in addressing these challenges. [7] used a hybrid system combining a support vector
machine classifier with a rule-based classification system and achieved an F1 score of 0.88 for
classifying animacy. [
        <xref ref-type="bibr" rid="ref10">9</xref>
        ] used what they call a linguistically uninformed model with word
embeddings and achieved an F1 score of 0.91 for the animate class.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Data</title>
      <sec id="sec-3-1">
        <title>3.1. Corpus</title>
        <p>Our approach is based on a corpus of 19 German folktales (see Appendix A). These were
selected from the Brothers Grimm’s Children’s and Household Tales (Kinder- und Hausmärchen),
a collection of folktales published from 1812 onwards [2, 3]. The texts were collected from
Wikisource, where all editions of the collection are available digitally.</p>
        <p>For selecting texts, we reviewed all 201 tales and 10 children’s legends for entities that are
depicted as animate but cannot be categorized as humans or animals in everyday terms. We
excluded cases that differ significantly in meaning and function from inanimate entities that
are animate and have a tangible counterpart in the real world. That is, texts containing
supernatural phenomena, such as the anthropomorphization of divine beings, the personification
of events like death, or metaphorical descriptions in which animacy is used as a stylistic device, were
not included. Moreover, humans or animals transformed into inanimate entities within the
fictional world were not considered if they exclusively displayed inanimate qualities. Magical
entities were examined case by case, as these often represent borderline cases of animacy
depiction. Since the depiction of animacy is strongly related to independent action, texts where this
action is explicitly described for magical entities were included.<sup>1</sup></p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Manual Animacy Annotations</title>
        <p>
          We annotated the 19 folktales in our corpus with regard to animacy. The full annotation
guideline is available in Appendix B. Our animacy concept is connected to coreference annotation,
as we not only annotate (proper) nouns but also mentions of animacy. However, our approach
differs from the one by [
          <xref ref-type="bibr" rid="ref8">7</xref>
          ] in that they use pre-annotated coreference chains,
annotating animacy in nouns, gendered pronouns and adjectives. It also differs from [9], as their
animacy concept is based on the rationality and intentionality of an entity, whereas we base
our understanding of animacy on agency and speech. However, like [9], we use untagged data.
        </p>
        <p><sup>1</sup> Our approach therefore differs from the principles in the Aarne–Thompson–Uther Index [19] and the Motif-Index
of Folk-Literature [18]. The ATU classifies animal tales but disregards animacy, and the Motif-Index bases the
classification of animals with human traits only on speech or role, but not on agency.</p>
        <p>We consider an entity animate if at least one of the following three conditions is met:
          <list list-type="order">
            <list-item><p>The entity performs an action independently and fulfills the agent role of a verb.</p></list-item>
            <list-item><p>The entity makes independent verbal utterances.</p></list-item>
            <list-item><p>The entity is described by a lexeme that refers to a living being, irrespective of its role
or actions in the sentence, unless an additional description explicitly excludes animacy
(e.g., a dead relative).</p></list-item>
          </list>
        </p>
        <p>In order to have an overview of entities in the text and to be able to relate to each entity,
for every animate entity one mention was annotated as a recognizable mention (rm) in the first
iteration of annotation. In the second iteration, all other expressions referring to animate
entities were marked as animate. The referring expressions include proper names, descriptions by
attributes (such as profession, gender, appearance, or social status), and pronouns, and can be
single- or multiple-token occurrences.</p>
        <p>A second annotator annotated KHM 6 and 10, resulting in an average Cohen’s kappa of
0.87.<sup>2</sup> Disagreement stems mostly from the second annotator tending to overlook several articles
as well as possessive and reflexive pronouns, while also tending to annotate shorter spans (e.g.
only “goldsmiths” instead of “the goldsmiths of the empire”). However, the first annotator
also overlooked several personal pronouns; we therefore decided to make the annotation of all
relevant pronouns more explicit in the guidelines.</p>
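        <p>A token-level agreement score of this kind can be sketched as follows. This is a minimal illustration, not the code used for the reported value: it assumes that every token carries exactly one of the tags animate or inanimate per annotator, and the function name cohen_kappa as well as the toy labels are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of token labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: share of tokens with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy example: one annotator marks "trusty John", the other only "John".
tokens = ["trusty", "John", "went", "home"]
annotator_1 = ["animate", "animate", "inanimate", "inanimate"]
annotator_2 = ["inanimate", "animate", "inanimate", "inanimate"]
kappa = cohen_kappa(annotator_1, annotator_2)
```
]]></preformat>
        <p>In the toy example, the name counts as a match and the adjective as a mismatch, so span-length disagreements lower the score only for the tokens that differ.</p>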
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Animacy and Related Concepts</title>
      <sec id="sec-4-1">
        <title>4.1. Animacy and Literary Characters</title>
        <p>In order to investigate the relationship between animate entities and characters, we performed
an additional annotation of characters (cf. the third iteration in the guidelines in Appendix B).
Additionally, we categorized the entities according to their degree of animacy, ranging
from human and animal to inanimate and supernatural (cf. Table 1). A closer look at the data
shows that animated entities appear more frequently as characters in fairy tales, with humans
making up more than half of the characters and often serving as protagonists, even in our
selection of folktales, which is skewed towards non-human animate entities. Animals are
portrayed as characters when they are humanized, transformed into humans, or perform certain
functions, while animals that are not characters are often tamed, play a secondary role in
animal stories, or serve a single function. While inanimate objects appear less frequently, they
often become characters when animated for narrative purposes, emphasizing the intentional
use of animated inanimate objects in the stories.</p>
        <p><sup>2</sup> In order to calculate the inter-annotator agreement, we assigned animate/inanimate tags to each token, splitting
multiword expressions. If one annotator annotated “trusty John” and the other annotator annotated only “John”,
the name is counted as a match while the adjective is not.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Animacy and Named Entities</title>
        <p>We now look into named entity recognition, which is currently the default approach to character
analysis in CLS. The analysis of the results of NER with Stanza [14] and our manual animacy
annotation reveals a disparity between entities recognized by NER and those annotated as
animate, with only 193 tokens overlapping (cf. Table 2). In terms of distribution, 106 tokens
are exclusively named entities, mostly involving mere mentions of names without action, while
5,588 cases are exclusively animate.</p>
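        <p>The three counts reported above correspond to a set comparison over token positions. The following sketch shows the idea under the assumption that both annotation layers are available as collections of (text id, token index) pairs; the function name compare_annotations and the example positions are hypothetical and do not reproduce our data.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def compare_annotations(ner_tokens, animate_tokens):
    """Count tokens tagged by both layers, by NER only, and as animate only."""
    ner, animate = set(ner_tokens), set(animate_tokens)
    return {
        "both": len(ner & animate),
        "ner_only": len(ner - animate),
        "animate_only": len(animate - ner),
    }


# Hypothetical token positions as (text id, token index) pairs.
ner_positions = [("khm6", 3), ("khm6", 17), ("khm10", 2)]
animate_positions = [("khm6", 3), ("khm6", 4), ("khm6", 17), ("khm10", 9)]
counts = compare_annotations(ner_positions, animate_positions)
```
]]></preformat>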
        <p>The scarcity of entity annotations for animate entities can primarily be attributed to the
NER approach, in which pronouns and articles are not considered entities. But there are also
errors in the NER, in which some proper names and appellatives were not identified properly
as named entities.</p>
        <p>Among the identified named entity types are 173 animate and 106 inanimate PER tokens,
and 20 animate (and 0 inanimate) LOC tokens. Entities that are annotated both as animate and
as named entities include diminutive forms, professions, kinship terms, and celestial bodies
like “Sonne” (sun) and “Mond” (moon).</p>
        <p>Next to these correctly identified cases, there are several missed mentions. The cases already
mentioned as well as other animacy mentions are only inconsistently recognized as named
entities across recurrent occurrences. For instance, “Besenchen” (diminutive of broom), “Bohne” (bean),
and “Drechsler” (wood turner) are recognized as S-PER (single-token person entities) only in
some of their occurrences within the same text, while others like “Gänsemagd” (goose maid)
and “Fuchs” (fox) show varied recognition across different texts. Instances of “Berg Semsi”
(Mount Semsi) are consistently annotated as animate but only recognized four times as LOC
and one time as PER out of ten cases. Also, unique tokens such as “Söhnlein” (diminutive of
son) and archaic forms like “Thier” (animal) are noted for their inconsistent recognition.</p>
        <p>The observation that named entity recognition (NER) does not fully encompass animacy
detection suggests that, even when disregarding NER errors,<sup>3</sup> animacy is a more effective
criterion for character detection (cf. animacy scores in Table 3).</p>
        <p><sup>3</sup> We additionally annotated PER entities in six folktales (KHM 6, 10, 11, 18, 24, 28). Stanza NER classification only
reached an F1 score of 0.7 (P: 0.63, R: 0.78) for these, which is a considerably worse performance than animacy
detection.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Animacy Classification</title>
      <sec id="sec-5-1">
        <title>5.1. Implementation of the Classifier</title>
        <p>
          In examining their annotated data, Karsdorp et al. observe that the part of speech of a word is
already a sort of ‘weak’ indicator for a word to be animate, as 40% of tokens they annotated as
animate are nouns or proper nouns, while only 11% of tokens tagged as inanimate are nouns
[
          <xref ref-type="bibr" rid="ref10">9</xref>
          ]. We can confirm this finding at least in part, as 26.5% of our tokens annotated as animate
are nouns or proper nouns and 5.7% of tokens tagged as inanimate are nouns or proper nouns.
[
          <xref ref-type="bibr" rid="ref10">9</xref>
          ] build on this observation by not only training their classifier on the manually annotated
data but also adding various linguistic features in order to find the best-performing combination
for the training input. They run several experiments in which they always include the manually
annotated tokens in a rolling context window of three tokens to the left and right (which they call
the lexical input). They subsequently combine this base data with the rolling context window
of the lemma, the part-of-speech tags (i.e. morphological features), the dependency tags (i.e.
syntactic features) and the embedding vector of the target token taken from a Word2Vec model
built on a web corpus (i.e. semantic features). We reproduced their way of creating lexical,
morphological and syntactic features using the Stanza library [14]. We furthermore trained
a Word2Vec model on 115 novels from the German Romantic era [17], a corpus which we deem
comparable to the literary language of the time. This Word2Vec model was trained using
Gensim [
          <xref ref-type="bibr" rid="ref19">15</xref>
          ], which is based on [
          <xref ref-type="bibr" rid="ref12">11</xref>
          ]. We used the same parameters as [9], who use the
skipgram architecture with a vector size of 300 (the other parameters were set to default).<sup>4</sup>
        </p>
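        <p>To make the rolling context window concrete, the following sketch builds window features for a single target token. It is a minimal illustration under our own assumptions and not the feature code of [9] or of our repository: the function name window_features, the padding symbol and the feature key format are hypothetical, and the example sentence and tags are invented.</p>
        <preformat preformat-type="code"><![CDATA[
```python
PAD = "<PAD>"  # hypothetical padding symbol for positions outside the text


def window_features(tokens, index, size=3, prefix="tok"):
    """Return the target item plus `size` neighbours on each side as a dict."""
    features = {}
    for offset in range(-size, size + 1):
        position = index + offset
        inside = 0 <= position < len(tokens)
        features[f"{prefix}[{offset:+d}]"] = tokens[position] if inside else PAD
    return features


tokens = ["der", "Knüppel", "sprang", "aus", "dem", "Sack"]
pos_tags = ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN"]

# Lexical and part-of-speech windows for "Knüppel" (index 1), merged into one dict.
features = {**window_features(tokens, 1), **window_features(pos_tags, 1, prefix="pos")}
```
]]></preformat>
        <p>Feature dictionaries of this shape can be turned into sparse vectors, for instance with scikit-learn's DictVectorizer, before the embedding vector of the target token is appended.</p>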
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Evaluation</title>
        <p>For evaluating the results, we calculated the F1 scores using 10-fold cross-validation,
differentiating between the much larger class of inanimate and the class of animate entities (cf. Table 3).
While we reached lower F1 scores, our results are comparable to the results of [9] with
regard to the combination of lexical features (tokens), part-of-speech tags and embedding vector
yielding the best result (F1 score of the Dutch classifier for the animate class of 0.93).</p>
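        <p>The per-class scores can be computed from gold and predicted token labels as sketched below. This is an illustrative sketch only: the function name f1_for_class and the toy labels are hypothetical, and how the ten folds are formed and aggregated is not shown here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def f1_for_class(gold, predicted, target):
    """Precision, recall and F1 for one class, computed from token labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels for six tokens (not taken from the corpus).
gold = ["animate", "animate", "inanimate", "inanimate", "inanimate", "animate"]
pred = ["animate", "inanimate", "inanimate", "animate", "inanimate", "animate"]
animate_scores = f1_for_class(gold, pred, "animate")
inanimate_scores = f1_for_class(gold, pred, "inanimate")
```
]]></preformat>
        <p>Reporting the animate class separately matters because the inanimate class is much larger and would otherwise dominate an overall accuracy figure.</p>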
        <p>
          Furthermore, we experimented with adding more annotated data to see if the performance
of the classifier plateaus at a certain point. For this, we annotated six additional KHM folktales.
Subsequently, we incrementally expanded the data for the classifier with all features by adding
one of these fairy tales at a time and conducted a 10-fold cross-validation to observe the
evolution of the F1 score of the animate class. The score did not increase; rather, it
tended to decrease slightly (cf. 4).
        </p>
        <p><sup>4</sup> With this we have successfully reproduced the workflow by [9] concerning lexical features and the combination of lexical
and semantic features. However, we have not yet determined how to incorporate additional features into this
training process. We used the same classification algorithm (Maximum Entropy, as implemented in scikit-learn
[
          <xref ref-type="bibr" rid="ref15">13</xref>
          ]).
        </p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Implementation in German Folktales</title>
        <p>The application of our classifier to the entire corpus of 211 Children’s and Household Tales
yields the results shown in Figure 1. The average proportion of animate tokens is 16%. The
20 texts with a proportion of ≤ 10% (bottom outliers) are consistently not written in standard
German.</p>
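        <p>The proportions discussed above can be derived from the classifier output as in the following sketch. It assumes one label per token and that the average is taken over texts rather than over all tokens; the names animate_proportion and predictions as well as the toy labels are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def animate_proportion(labels):
    """Share of tokens labelled animate in one text."""
    return sum(label == "animate" for label in labels) / len(labels)


# Hypothetical classifier output for three very short "texts".
predictions = {
    "text_a": ["animate", "inanimate", "inanimate", "inanimate"],
    "text_b": ["animate", "animate", "inanimate", "inanimate"],
    "text_c": ["inanimate"] * 10,
}
proportions = {name: animate_proportion(labels) for name, labels in predictions.items()}
mean_proportion = sum(proportions.values()) / len(proportions)
# Texts at or below the 10% threshold used for the bottom outliers.
low_outliers = sorted(name for name, share in proportions.items() if share <= 0.10)
```
]]></preformat>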
        <p>A spot-check of the annotations indicates reasonably good results. The classifier even
discerns correctly between mere name references in direct speech and mentions of animate
entities. For example, the main character of the eponymous tale KHM 55 Rumpelstiltskin is
annotated as animate when referred to as “Männchen” (little man), whereas the two proper name
mentions used only as a name reference in direct speech are classified correctly as inanimate. In
KHM 34 Clever Elsie, the character is not classified as animate in direct speech but is recognized
as such in narrative parts, and in KHM 166 Strong Hans the character “Hans” is consistently
recognized correctly as animate. However, the classifier struggles with correctly detecting rare
and complex tokens. For example, the more common “Fuchs” (fox) is recognized more reliably
than the more complex form “Rothfuchs” (red fox) in KHM 73 The Wolf and the Fox. On the
other hand, the animate “Vogel” (bird) and, in contrast, the inanimate “Vogelherz” (bird heart) in
KHM 122 Donkey Cabbages are both classified correctly.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion and Outlook</title>
      <p>Our approach to animacy classification achieves a reasonably good detection of animacy, and its
application to a corpus of German folktales provides some interesting insights. With regard to
the assumed relation between named entities and animacy, we have shown only partial overlap
both for the concepts and for their detection. In other words, our results indicate that we did
not simply achieve NER in place of animacy detection. These outcomes demonstrate that our
animacy approach is distinct from NER. In fact, the correlation between the relative frequency of
person entities (automatically tagged using [14]) and the relative frequency of animate entities
(using our classifier) in the Grimm corpus is rather low, with a Pearson correlation coefficient
of -0.132 and a Spearman correlation coefficient of -0.195. The classifier adheres to our animacy
framework both with regard to animacy and inanimacy. Furthermore, the classifier addresses
a gap in detecting animated animals and objects. This capability suggests that with further
development, it could also enhance character recognition.</p>
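        <p>For reference, the two correlation coefficients can be computed from per-text relative frequencies as in the sketch below. This is a plain-Python illustration and not the code behind the reported values; the helper names pearson, average_ranks and spearman are our own, and the example frequencies are invented.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for i in order[start:end + 1]:
            ranks[i] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical relative frequencies per tale (not the values from the Grimm corpus).
person_freq = [0.02, 0.05, 0.01, 0.04]
animate_freq = [0.20, 0.12, 0.18, 0.16]
r = pearson(person_freq, animate_freq)
rho = spearman(person_freq, animate_freq)
```
]]></preformat>
        <p>Coefficients close to zero, as reported above, indicate that texts rich in person entities are not systematically rich in animate tokens.</p>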
      <p>Accordingly, future work should explore the combination with NER and coreference
resolution for the identification of characters as well as the potential of LLMs for the annotation of
animacy. From the perspective of analysis, filtering out human entities would also be an
interesting future step, allowing the analysis of animate objects, animals, and other potential candidates
for characters not displaying features of person entities.</p>
      <p>[19] H.-J. Uther. The types of international folktales: a classification and bibliography; based
on the system of Antti Aarne and Stith Thompson. FF Communications. Helsinki:
Suomalainen Tiedeakatemia, 2004.</p>
      <p>[20] M. Westfall. “Perceiving agency”. In: Mind &amp; Language 38.3 (2023), pp. 847–865. doi:
10.1111/mila.12399.</p>
      <p>[21] M. Yamamoto. Animacy and reference: A cognitive approach to corpus linguistics. Vol. 46.
Studies in language Companion series, SLCS. Amsterdam and Philadelphia, PA: John
Benjamins Publ, 1999. doi:10.1075/slcs.46.</p>
      <p>A. Corpus</p>
    </sec>
    <sec id="sec-7">
      <title>B. Annotation Guidelines</title>
      <p>The annotation process is designed for use with the CATMA platform and proceeds through three
iterations. It is based on the understanding of mentions from coreference chains. The
annotation span ranges from single tokens to multi-word expressions.</p>
      <sec id="sec-7-1">
        <title>Iteration 1: Overview of Entities</title>
        <p>In the first iteration, the goal is to provide an overview of all animate entities and to be able to
relate to each entity. Each entity recognized as animate is marked with a clearly identifiable
mention in the text that we call recognizable mention (rm). The mention does not necessarily
need to be the first occurrence of the entity; rather, it should be one that allows for quick
identification. An entity qualifies as animate if at least one of the following three criteria is met:
          <list list-type="order">
            <list-item><p>The entity performs an independent action explicitly described in the text, occupying
the agent role of a verb.</p></list-item>
            <list-item><p>The entity makes independent verbal expressions.</p></list-item>
            <list-item><p>The entity is described by a lexeme that refers to a living being, irrespective of its role
or actions in the sentence, unless an additional description explicitly excludes animacy
(e.g., a dead relative).</p></list-item>
          </list>
        </p>
        <p>
          For example, in the sentence “directly jumped out the stick, and dealt a shower of blows on
the coat or jerkin, and the back beneath, which quickly ended the affair” (KHM 36 The Table, the
Ass and the Stick, [
          <xref ref-type="bibr" rid="ref7">6</xref>
          ]) the stick’s agent role is evident. Therefore, it meets the first criterion and
is annotated as animate. In contrast, in the sentence “When placed and spoken to, ‘Little table,
set yourself,’ it would immediately be covered with a clean cloth, with plates, knives, and forks
beside it” (KHM 36 The Table, the Ass and the Stick) an independent action is implied, although
it is not explicitly depicted. The little table does not occupy an agent role and is therefore not
marked as animate.
        </p>
        <p>As an example of the second criterion, we look at the sentence “but the bread called out,
‘Oh, take me out, take me out, or I’ll burn; I’ve been done for a long time.’” (KHM 24 Mother
Holle). The bread is the originator of an independent verbal statement and is hence marked as
animate.</p>
        <p>The third criterion can be observed in the description “After lifting the girl onto his horse,
the old woman showed him the way” (KHM 49 The Six Swans). Although the horse is not in an
agent position here, readers’ world knowledge recognizes a horse as an animate entity, so it is
marked as animate. This extends to entities that are not characters, such as relatives mentioned
but not directly appearing in the text, which are also annotated.</p>
        <p>Furthermore, a new recognizable mention is annotated for entities that are transformed
radically, where the transformed entity also satisfies one of the conditions explained above. For example,
in KHM 6 Trusty John the title character is transformed into a speaking stone.</p>
        <p>If multiple entities get introduced as a group (“three ravens”), the first mention of the group
gets annotated instead of single first mentions of each member of the group.</p>
        <p>To further clarify the rules, some borderline cases are discussed in the following. In some
fairy tales, the narrator appears through a first-person reference and the reader is also
referenced.</p>
        <p>• “eagle and finch, owl and crow, lark and sparrow, what should I call them all?”
(KHM 171 The Wren)
• “and the donkey didn’t stop until everyone had so much that they couldn’t carry anymore.
(I can see it in your face, you would have liked to be there too.)”
(KHM 36 The Table, the Ass, and the Stick)</p>
        <p>The narrator and the recipient are regarded here as textual constructs with no real-world
counterpart [10, p. 61]. As a result, their reference expressions cannot be assigned a definite
degree of animacy.</p>
        <p>Common borderline cases are magical objects that appear in fairy tales. Rule (1) has already
clarified that explicit independent action is a prerequisite for the animacy annotation. However,
cases occur where classification is still ambiguous.</p>
        <p>• “the way was so hard to find that he would not have found it if a wise woman had not
given him a ball of yarn; when he threw it in front of him, it unwound by itself and
showed him the way.”
(KHM 49 The Six Swans)
• “Now she could not rest until she found out where the king kept the ball of yarn”
(KHM 49 The Six Swans)</p>
        <p>The ball of yarn in the first excerpt clearly occupies the agent role of the verbs “unwind” and
“show.” In the second quote, however, it is used with the verb “keep,” which typically requires
an inanimate object. Based on this case, it was decided that a single animate occurrence is
sufficient for marking the entity as animate.</p>
      </sec>
      <sec id="sec-7-2">
        <title>Iteration 2: Annotation of all Mentions</title>
        <p>In the second iteration, all reference expressions referring to entities marked as animate in the
previous iteration were annotated. Reference expressions include all noun phrases containing
proper names including the article, descriptors based on attributes such as occupation (“the
brave little tailor”), gender, appearance (“the beautiful one”), or social status (“the poor man,”
“the princess”), as well as personal, demonstrative, relative, possessive, or indefinite pronouns.
Additionally, all expressions referring to such noun phrases through any reference type were
included. This annotation level provides an overview of where and how often animate entities
appear. Borderline cases in the annotation process include vague references (“everyman”),
reflexive verb constructions (“he withdrew himself”), and entities recognized as animate only later
in the story. Vague references and reflexive pronouns in reflexive verb constructions are not
annotated as animate because they do not refer to any specific animate entity. Entities that
appear as inanimate but can be recognized as animate over the course of the story are consistently
marked as animate.</p>
      </sec>
      <sec id="sec-7-3">
        <title>Iteration 3: Annotation of Character Status and Animacy Degree</title>
        <p>In the third and final iteration, the existing annotations are enriched with the properties
‘character’ and ‘degree of animacy’. The former indicates whether the animate entity is a character or not.
The latter marks it as human, animal, object or supernatural.</p>
        <p>An entity is marked as a character if the description in the text includes some form of the
semantic feature “human” (“the king”). Another indicator is the association with verbs
describing typically human actions. Entities that are sources of verbal expressions (“the lion spoke”)
or exhibit a complex inner life or thinking are also marked. Some borderline cases include
groups of people.</p>
        <p>• “The king summoned all goldsmiths, who had to work day and night”
(KHM 6 Trusty John)
• “Then the other servants of the king, who did not favor Faithful John, shouted, ‘How
shameful to kill the beautiful animal that was to carry the king to his castle!’ ”
(KHM 6 Trusty John)</p>
        <p>Individual cases must be distinguished. The term “goldsmiths” can be semantically
associated with the occupation of a human, but no individuals are visible in this description, so they
do not appear as characters. The servants, even collectively, show a complex inner life through
their mistrust and are therefore considered characters.</p>
        <p>The second property indicates the degree of animacy of a corresponding entity in the real
world. Animate entities can be annotated with the values “human,” “animal,” “supernatural,”
or “inanimate.” This distinction between perception within the fictional world and the world
knowledge applied during reception is important. Although entities in narratives do not form
a direct reference to the real world due to their fictional nature, readers derive many features
from their world knowledge about the corresponding real-world entity. A borderline case in
this categorization is the description of body parts. The head of a horse (cf. KHMT8h9e Goose
Girl) could be classified as animal and inanimate. Here, it is argued that the category animal
implies a form of animacy. Parts of a dead animal would be perceived as inanimate in everyday
life and are therefore categorized as such here.</p>
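        <p>The two properties can be illustrated with a minimal sketch. It is one possible reading of the guidelines, not the annotation tooling itself: the helper name looks_like_character, its boolean parameters, and the order in which the criteria are checked are illustrative assumptions. The four degree values are those listed above.</p>
        <preformat>
```python
from enum import Enum


class Degree(Enum):
    """Values of the 'degree of animacy' property named in the guidelines."""
    HUMAN = "human"
    ANIMAL = "animal"
    SUPERNATURAL = "supernatural"
    INANIMATE = "inanimate"


def looks_like_character(human_feature, human_action, speaks, inner_life,
                         individuals_visible):
    """Hypothetical helper: one reading of the 'character' property."""
    # Speech or a complex inner life marks a character, even for a group.
    if speaks or inner_life:
        return True
    # A human feature or a typically human action only counts when
    # individual entities are visible in the description (assumption).
    return individuals_visible and (human_feature or human_action)


# The two borderline groups from KHM 6, encoded by hand.
goldsmiths = looks_like_character(True, True, False, False,
                                  individuals_visible=False)
servants = looks_like_character(True, False, True, True,
                                individuals_visible=False)
# The horse's head is treated as part of a dead animal.
horse_head_degree = Degree.INANIMATE

print(goldsmiths, servants, horse_head_degree.value)
```
        </preformat>
        <p>Under these assumptions the goldsmiths are not marked as characters, the servants are, and the horse's head receives the value “inanimate”.</p>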
      </sec>
    </sec>
    <sec id="sec-8">
      <title>C. Online Resources</title>
      <p>Data and code can be found here: https://github.com/forTEXT/animacy_in_german_folktales.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Borgards</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Middelhoff</surname>
          </string-name>
          , and B. Thums, eds.
          <source>Romantische Ökologien: Vielfältige Naturen um 1800</source>
          . Vol.
          <volume>4</volume>
          .
          Neue Romantikforschung. Berlin, Heidelberg: Springer,
          <year>2023</year>
          . doi: 10.1007/978-3-662-67186-3.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Brüder</given-names>
            <surname>Grimm</surname>
          </string-name>
          .
          <source>Kinder und Hausmärchen: Band</source>
          <volume>1</volume>
          . 7th ed. Göttingen: Verlag der Dieterichschen Buchhandlung,
          <year>1857</year>
          . url: https://de.wikisource.org/wiki/Kinder-_und_Haus-M%C3%A4rchen_Band_1_(1857).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Brüder</given-names>
            <surname>Grimm</surname>
          </string-name>
          .
          <source>Kinder und Hausmärchen: Band</source>
          <volume>2</volume>
          . 7th ed. Göttingen: Verlag der Dieterichschen Buchhandlung,
          <year>1857</year>
          . url: https://de.wikisource.org/wiki/Kinder-_und_Haus-M%C3%A4rchen_Band_2_(1857).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bußmann</surname>
          </string-name>
          , ed.
          <source>Lexikon der Sprachwissenschaft</source>
          . 4th ed. Stuttgart: Alfred Kröner Verlag,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Coll Ardanuy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Nanni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Beelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ahnert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>McDonough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tolfo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Wilson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>McGillivray</surname>
          </string-name>
          .
          “
          <article-title>Living Machines: A study of atypical animacy”</article-title>
          .
          <source>In: Proceedings of the 28th International Conference on Computational Linguistics.</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          Ed. by
          <string-name>
            <given-names>D.</given-names>
            <surname>Scott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zong</surname>
          </string-name>
          . Barcelona, Spain (Online): International Committee on Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>4534</fpage>
          -
          <lpage>4545</lpage>
          . doi: 10.18653/v1/2020.coling-main.400.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <article-title>“The Table, the Ass, and the Stick”</article-title>
          . In:
          <source>Household Stories, illustrated by Walter Crane, translated by Lucy Crane</source>
          . Ed. by
          <string-name>
            <given-names>J.</given-names>
            <surname>Grimm</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Grimm</surname>
          </string-name>
          . Trans. by
          <string-name>
            <given-names>L.</given-names>
            <surname>Crane</surname>
          </string-name>
          .
          <year>1882</year>
          . url: https://en.wikisource.org/wiki/Household_stories_from_the_collection_of_the_Bros_Grimm_(L_%26_W_Crane)/The_Table,_the_Ass,_and_the_Stick.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Jahan</surname>
          </string-name>
          , G. Chauhan, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Finlayson</surname>
          </string-name>
          .
          <article-title>“A New Approach to Animacy Detection”</article-title>
          .
          <source>In: Proceedings of the 27th International Conference on Computational Linguistics</source>
          . Ed. by
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Bender</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Derczynski</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Isabelle</surname>
          </string-name>
          . Santa Fe, New Mexico, USA: Association for Computational Linguistics,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Jannidis</surname>
          </string-name>
          .
          <source>Figur und Person. Beitrag zu einer historischen Narratologie</source>
          . Berlin: de Gruyter,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F.</given-names>
            <surname>Karsdorp</surname>
          </string-name>
          , M. van der Meulen, T. Meder, and
          <string-name>
            <given-names>A.</given-names>
            <surname>van den Bosch</surname>
          </string-name>
          . “
          <article-title>Animacy Detection in Stories”</article-title>
          . In:
          <source>6th Workshop on Computational Models of Narrative (CMN 2015)</source>
          . Ed. by
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Finlayson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lieto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Ronfard</surname>
          </string-name>
          . Vol.
          <volume>45</volume>
          . Open Access Series in Informatics (OASIcs). Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik,
          <year>2015</year>
          , pp.
          <fpage>82</fpage>
          -
          <lpage>97</lpage>
          . doi: 10.4230/OASIcs.CMN.2015.82.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lahn</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Meister</surname>
          </string-name>
          .
          <source>Einführung in die Erzähltextanalyse</source>
          . Stuttgart: Metzler,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          , G. Corrado, and J. Dean.
          <source>Efficient Estimation of Word Representations in Vector Space</source>
          .
          <year>2013</year>
          . doi: 10.48550/arXiv.1301.3781.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Nieuwland</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. J. A.</given-names>
            <surname>van Berkum</surname>
          </string-name>
          . “
          <article-title>When peanuts fall in love: N400 evidence for the power of discourse”</article-title>
          . In:
          <source>Journal of cognitive neuroscience</source>
          <volume>18</volume>
          .
          <issue>7</issue>
          (
          <year>2006</year>
          ), pp.
          <fpage>1098</fpage>
          -
          <lpage>1111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          doi: 10.1162/jocn.2006.18.7.1098.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          . “
          <article-title>Scikit-learn: Machine Learning in Python”</article-title>
          . In:
          <source>Journal of Machine Learning Research</source>
          12.85
          (
          <year>2011</year>
          ), pp.
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , J. Bolton, and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          . “
          <article-title>Stanza: A Python Natural Language Processing Toolkit for Many Human Languages”</article-title>
          . In:
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations</source>
          . Ed. by
          <string-name>
            <given-names>A.</given-names>
            <surname>Celikyilmaz</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.-H.</given-names>
            <surname>Wen</surname>
          </string-name>
          . Online: Association for Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>108</lpage>
          . doi: 10.18653/v1/2020.acl-demos.14.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zehe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lorenzen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sergel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Düker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krug</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Puppe</surname>
          </string-name>
          . “
          <article-title>The FairyNet Corpus - Character Networks for German Fairy Tales”</article-title>
          .
          In:
          <source>Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature</source>
          . Ed. by
          <string-name>
            <given-names>S.</given-names>
            <surname>Degaetano-Ortlieb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kazantseva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Reiter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Szpakowicz</surname>
          </string-name>
          . Punta Cana, Dominican Republic (online):
          <source>Association for Computational Linguistics</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>49</fpage>
          -
          <lpage>56</lpage>
          . doi: 10.18653/v1/2021.latechclfl-1.6.
        </mixed-citation>
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schumacher</surname>
          </string-name>
          , I. Uglanova, and
          <string-name>
            <given-names>E.</given-names>
            <surname>Gius</surname>
          </string-name>
          .
          <source>d-Romane-Romantik (d-RoRo)</source>
          .
          <year>2022</year>
          . doi: 10.5281/zenodo.7215170.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Thompson</surname>
          </string-name>
          .
          <source>Motif-index of folk-literature: a classification of narrative elements in folktales, ballads, myths, fables, mediaeval romances, exempla, fabliaux, jest-books, and local legends: A - C</source>
          . Vol.
          <volume>1</volume>
          . A - C. Indiana University studies ; Vol.
          <volume>19</volume>
          , No.
          <volume>96</volume>
          /97. Bloomington, Ind.: Univ. Libr.,
          <year>1934</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Řehůřek</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Sojka</surname>
          </string-name>
          . “
          <article-title>Software Framework for Topic Modelling with Large Corpora”</article-title>
          .
          In:
          <source>Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks</source>
          . Valletta, Malta: ELRA,
          <year>2010</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>