<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Set of Annotations for supporting a TTS Application for Folktales</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Thierry Declerck</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Annotation of Proppian Functions</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Multilingual Technologies, German Research Center for Artificial Intelligence, DFKI GmbH</institution>
        </aff>
      </contrib-group>
      <fpage>58</fpage>
      <lpage>63</lpage>
      <abstract>
        <p>In this short demonstration paper we present different layers of annotation for folktales that we have been working on and that are being integrated into one set of annotations, mediated by a formal representation of the annotation elements in an ontological framework. We list the modules of this annotation scheme. A main result of this work is the implementation of a Text-to-Speech (TTS) application for folktales, which is the core of our demonstration system.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In this short poster and demonstration paper we present work carried out in
several software projects and bachelor's or master's theses at Saarland University
and at DFKI. The goal of these efforts was to develop annotation schemes that
support efficient access to topics of interest in folktales, so that the tales can be
included in applications. The resulting layers of annotation are being integrated
into one set of annotations, mediated by a formal representation of the annotation
elements in an ontological framework. In this short text we list the modules of this
annotation framework. A main result of this work is the implementation of a
Text-to-Speech (TTS) application for folktales. We also briefly discuss current work
on extending the annotation scheme with information from two very influential
classification schemes used in international folkloristics.
      </p>
    </sec>
    <sec>
      <title>Annotation of Proppian Functions</title>
      <p>
A first approach to the annotation of folktales was pursued in a cooperation
between the first instantiation of the CLARIN-D project<sup>1</sup> and the former
Dutch NWO project Amicus (Automated Motif Discovery in Cultural Heritage and
Scientific Communication Texts)<sup>2</sup>. In this context, we developed an extended
scheme for annotating folktales with Proppian functions<sup>3</sup>. The
work was based on a redesign of the coarse-grained scheme proposed by [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
In addition to the Proppian functions, the resulting scheme included some textual
properties, temporal and dialogue structures, and information on the central
characters of the tales. Results of this work are described in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
This scheme was later used to support a first information extraction system
applied to tales.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Ontology-driven automated Textual Analysis of Tales</title>
      <p>
        Building on the work described in the previous section, an automated linguistic
analysis of tales was developed. The goal was not only to detect the characters of
the tales automatically, but also to provide a co-reference analysis, so that the
actions in which the characters are involved can be fully specified. This in turn
supports the automated detection of Proppian functions, together with the characters
involved. Results of the analysis are stored in a database, which has been further
developed into an ontological framework, adding not only an annotation layer but also
a formal representation level (see [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). The ontological representation also makes it possible to
generalize over the specification of the characters (human vs. animal,
supernatural, etc.). The system was also able to perform reference resolution of the
following kind: a character introduced as a daughter can also be referred to as a
sister. Figures 1 and 2 show screenshots of this ontological framework, as
visualized by the Protégé ontology tools<sup>4</sup>.
      </p>
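      <p>The generalization over character classes and the role-based reference resolution described above can be illustrated with a minimal Python sketch. It is not the implementation of the system: the class hierarchy, the names SUPERCLASS, Character, ancestors, is_a and resolve, and the sample classes are all assumptions made for illustration, and the actual ontology is considerably richer.</p>
      <preformat>
```python
from dataclasses import dataclass, field

# Hypothetical mini-hierarchy (child class -> parent class).
# The class names are illustrative and not taken from the paper's ontology.
SUPERCLASS = {
    "Daughter": "Human",
    "Sister": "Human",
    "King": "Human",
    "Wolf": "Animal",
    "Witch": "Supernatural",
    "Human": "Character",
    "Animal": "Character",
    "Supernatural": "Character",
}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        chain.append(cls)
    return chain


def is_a(cls, general):
    """True if cls equals general or is one of its subclasses."""
    return cls == general or general in ancestors(cls)


@dataclass
class Character:
    ident: int
    # Ontology classes this instance belongs to; one character may have several.
    roles: set = field(default_factory=set)


def resolve(mention_class, characters):
    """Return the characters that a mention of mention_class may refer to."""
    return [
        c for c in characters
        if any(is_a(role, mention_class) for role in c.roles)
    ]
```
      </preformat>
      <p>Because a character instance may carry several role classes in this sketch, a mention of "the daughter" and a mention of "the sister" can both resolve to the same instance, while a query for a general class such as Animal returns every instance below it in the hierarchy.</p>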
      <p>
        The decision to use an ontological framework turned out to be very useful,
since further work on distinct elements of the tales could be integrated easily. For
example, the work described in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] considered the detection of sentiments expressed
by the characters of the tales. Such sentiments (joy, happiness, sadness etc.) could
be added in a straightforward manner to instances of characters within the time
span in which they occur in the tales.
      </p>
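      <p>Attaching sentiments to character instances within a time span, as described above, could be modeled roughly as follows. This is a sketch under stated assumptions: the names SentimentSpan and sentiments_at, and the use of sentence indices as the time unit, are hypothetical, since the text does not specify how time spans are encoded in the ontology.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentimentSpan:
    character_id: int   # identifier of the character instance
    sentiment: str      # e.g. "joy", "happiness", "sadness"
    start: int          # first sentence index of the span (inclusive)
    end: int            # last sentence index of the span (inclusive)


def sentiments_at(spans, character_id, sentence_index):
    """Return the sentiments recorded for a character at a given sentence."""
    return [
        s.sentiment
        for s in spans
        if s.character_id == character_id
        and sentence_index in range(s.start, s.end + 1)
    ]
```
      </preformat>
      <p>Keeping the sentiment on a span rather than on the character itself lets the same character be joyful in one passage and sad in another without overwriting earlier annotations.</p>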
    </sec>
    <sec id="sec-3">
      <title>Dialogue Structure Annotation for a TTS Output</title>
      <p>
        Detecting dialogue structures in folktales (and in fact in any narrative) is
essential for knowing who is “speaking” to whom, and serves as an anchor for building
a Text-to-Speech (TTS) system for folktales. This also includes detecting the text
passages in which a narrator describes the events. The work
summarized in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] mainly addresses the issue of adding such TTS functionality to
the automatic analysis of the text provided by the work described in the previous section.
The TTS system accesses the instances of the characters in the populated ontology
(see Figure 2 for an example of a populated ontology), retrieves the sentiment
information encoded there, and models the voice output of the various characters
accordingly.
        <fn><label>2</label><p>https://ilk.uvt.nl/amicus/</p></fn>
        <fn><label>3</label><p>These functions were introduced by Vladimir Propp in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].</p></fn>
        <fn><label>4</label><p>See http://protege.stanford.edu/ for more details on those ontology tools.</p></fn>
The TTS system used in this case is the Mary system,
which is described, among others, in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]<sup>5</sup>. Figure 3 displays a screenshot of the
system showing the kind of information that is extracted from the annotation to
generate the TTS output. One example is the information on who
is “speaking” (the narrator or one of the characters), which is encoded by the
feature “ID”. The reader can see the attributes associated with each detected
character (for example, for ID 1 and ID 2; ID -1 is the narrator). It is worth noting
that those attributes are stored in the ontology, and that some of them are in
fact inferred from the text processing. The sentiments of the “speakers” are also
encoded and are displayed in this screenshot within angle brackets.
      </p>
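      <p>How speaker IDs and sentiments might drive the voice output can be sketched as follows. The sketch does not call the Mary system and does not reproduce its API; the voice names, the prosody values and the names Segment, VOICES, PROSODY and plan_utterances are assumptions for illustration. Only the convention that ID -1 denotes the narrator is taken from the text above.</p>
      <preformat>
```python
from dataclasses import dataclass

NARRATOR_ID = -1  # the text above uses ID -1 for the narrator


@dataclass
class Segment:
    speaker_id: int            # value of the "ID" feature
    text: str                  # passage to be spoken
    sentiment: str = "neutral"


# Hypothetical voice names; a real setup would use voices installed in the TTS engine.
VOICES = {NARRATOR_ID: "narrator-voice", 1: "voice-a", 2: "voice-b"}

# Hypothetical prosody settings per sentiment (relative rate, pitch shift in semitones).
PROSODY = {
    "neutral": {"rate": 1.0, "pitch_shift": 0},
    "joy": {"rate": 1.1, "pitch_shift": 2},
    "sadness": {"rate": 0.9, "pitch_shift": -2},
}


def plan_utterances(segments):
    """Map each annotated segment to a voice and prosody settings."""
    plan = []
    for seg in segments:
        # Unknown speakers fall back to the narrator voice.
        voice = VOICES.get(seg.speaker_id, VOICES[NARRATOR_ID])
        # Unknown sentiments fall back to neutral prosody.
        prosody = PROSODY.get(seg.sentiment, PROSODY["neutral"])
        plan.append({"voice": voice, "text": seg.text, **prosody})
    return plan
```
      </preformat>
      <p>The resulting plan is a list of plain dictionaries, one per segment, which a TTS front end could then translate into whatever request format the chosen engine expects.</p>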
    </sec>
    <sec id="sec-4">
      <title>On-going Work</title>
      <p>
        We have recently also started looking at other metadata to be used for
annotating folktales, and at how to integrate them with the Proppian functions. To
this end, we looked at the well-known classification systems of Stith Thompson ([
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]), which categorizes the motifs used in folktales, and of Antti Aarne ([
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]) and
Hans-Jörg Uther ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]), which deal with the types of folktales, and we are starting
to integrate those models into our ontology. Additionally, we linked the detected
characters to WordNet, investigating whether this can help to disambiguate such
characters (see [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]).
        <fn><label>5</label><p>More details on Mary are given at http://mary.dfki.de/</p></fn>
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We presented past and on-going work on providing annotations for folktales
that support specific NLP-based applications. In the course of the different
software projects and bachelor's or master's theses dedicated to this effort, we
found that the use of ontologies is crucial for integrating the various
elements of the different annotation layers. The successful implementation of a
TTS system applied to folktales provided a proof of concept. We are currently
extending our integration work with the ontologization of influential classification
schemes in the field of folkloristics.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><surname>Declerck</surname>, <given-names>Thierry</given-names></string-name>,
          <string-name><surname>Scheidel</surname>, <given-names>Antonia</given-names></string-name> and
          <string-name><surname>Lendvai</surname>, <given-names>Piroska</given-names></string-name>
          (<year>2011</year>)
          <article-title>Proppian Content Descriptors in an Integrated Annotation Schema for Fairy Tales</article-title>.
          In <source>Selected Papers from the LaTeCH Workshop Series, Theory and Applications of Natural Language Processing</source>,
          pp. <fpage>155</fpage>-<lpage>169</lpage>, Heidelberg: Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><surname>Scheidel</surname>, <given-names>Antonia</given-names></string-name> and
          <string-name><surname>Declerck</surname>, <given-names>Thierry</given-names></string-name>
          (<year>2010</year>)
          <article-title>APftML - Augmented Proppian fairy tale Markup Language</article-title>.
          In <source>Proceedings of the First International AMICUS Workshop on Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts</source>,
          Vienna, Austria.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><surname>Koleva</surname>, <given-names>Nikolina</given-names></string-name>,
          <string-name><surname>Declerck</surname>, <given-names>Thierry</given-names></string-name> and
          <string-name><surname>Krieger</surname>, <given-names>Hans-Ulrich</given-names></string-name>
          (<year>2012</year>)
          <article-title>An Ontology-Based Iterative Text Processing Strategy for Detecting and Recognizing Characters in Folktales</article-title>.
          In <source>Proceedings of the Digital Humanities 2012 Conference</source>,
          Hamburg, Germany.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
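          [4]
          <string-name><surname>Eisenreich</surname>, <given-names>Christian</given-names></string-name>,
          <string-name><surname>Ott</surname>, <given-names>Jana</given-names></string-name>,
          <string-name><surname>Süßdorf</surname>, <given-names>Tonio</given-names></string-name>,
          <string-name><surname>Willms</surname>, <given-names>Christian</given-names></string-name> and
          <string-name><surname>Declerck</surname>, <given-names>Thierry</given-names></string-name>
          (<year>2014</year>)
          <article-title>From Tale to Speech: Ontology-based Emotion and Dialogue Annotation of Fairy Tales with a TTS Output</article-title>.
          In <source>Proceedings of ISWC 2014</source>, Riva del Garda, Italy.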
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><surname>Thompson</surname>, <given-names>Stith</given-names></string-name>
          (<year>1977</year>)
          <article-title>The Folktale</article-title>.
          Berkeley: University of California Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><surname>Thompson</surname>, <given-names>Stith</given-names></string-name>
          (<year>1955</year>)
          <article-title>Motif-index of folk-literature: A classification of narrative elements in folktales, ballads, myths, fables, medieval romances, exempla, fabliaux, jest-books, and local legends</article-title>.
          <source>Revised and enlarged edition</source>.
          Bloomington: Indiana University Press.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><surname>Aarne</surname>, <given-names>Antti</given-names></string-name>
          (<year>1961</year>)
          <article-title>The Types of the Folktale: A Classification and Bibliography</article-title>.
          Helsinki: The Finnish Academy of Science and Letters.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><surname>Uther</surname>, <given-names>Hans-Jörg</given-names></string-name>
          (<year>2004</year>)
          <article-title>The Types of International Folktales: A Classification and Bibliography. Based on the system of Antti Aarne and Stith Thompson</article-title>.
          <source>FF Communications no. 284-286</source>.
          Helsinki: Suomalainen Tiedeakatemia.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><surname>Declerck</surname>, <given-names>Thierry</given-names></string-name>,
          <string-name><surname>Tyler</surname>, <given-names>Clement</given-names></string-name> and
          <string-name><surname>Kostova</surname>, <given-names>Antonia</given-names></string-name>
          (<year>2016</year>)
          <article-title>Towards a WordNet based Classification of Actors in Folktales</article-title>.
          In <source>Proceedings of the Eighth Global WordNet Conference</source>,
          Bucharest, Romania.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><surname>Propp</surname>, <given-names>Vladimir</given-names></string-name>
          (<year>1968</year>)
          <article-title>Morphology of the folktale</article-title>.
          Austin: University of Texas Press.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><surname>Malec</surname>, <given-names>Scott</given-names></string-name>
          (<year>2001</year>)
          <article-title>Proppian structural analysis and XML modeling</article-title>.
          In <source>Proceedings of CLiP</source>.
          Duisburg, Germany.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
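          [12]
          <string-name><surname>Schröder</surname>, <given-names>Marc</given-names></string-name>,
          <string-name><surname>Charfuelan</surname>, <given-names>Marcela</given-names></string-name>,
          <string-name><surname>Pammi</surname>, <given-names>Sathish</given-names></string-name> and
          <string-name><surname>Steiner</surname>, <given-names>Ingmar</given-names></string-name>
          (<year>2011</year>)
          <article-title>Open source voice creation toolkit for the MARY TTS Platform</article-title>.
          In <source>Proceedings of Interspeech</source>, Florence, Italy.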
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>