Metadata annotation for dramatic texts

Vincenzo Lombardo, CIRMA/Dipartimento di Informatica, Università di Torino, vincenzo.lombardo@unito.it
Rossana Damiano, CIRMA/Dipartimento di Informatica, Università di Torino, rossana.damiano@unito.it
Antonio Pizzo, CIRMA/Dipartimento Studi Umanistici, Università di Torino, antonio.pizzo@unito.it

Abstract

English. This paper addresses the problem of the metadata annotation for dramatic texts. Metadata for drama describe the dramatic qualities of a text, connecting them with the linguistic expressions. Relying on an ontological representation of the dramatic qualities, the paper presents a proposal for the creation of a corpus of annotated dramatic texts.

Italiano. Questo articolo affronta il problema dell'annotazione di metadati per i testi drammatici. I metadati per il dramma descrivono le qualità drammatiche di un testo, connettendole alle espressioni linguistiche. Basandosi su una rappresentazione ontologica delle qualità drammatiche, l'articolo presenta una proposta per la creazione di un corpus di testi drammatici annotati.

1 Introduction

Drama annotation is the process of annotating the metadata of a drama. Given a drama expressed in some medium (text for screenplays, audiovisual for cinema, interactive multimedia for videogames, etc.; Esslin terms these "dramatic media", i.e. media that display characters performing actions), the process of metadata annotation identifies the elements that characterize the drama and annotates them in some metadata format. For example, in the sentence "Laertes and Polonius warn Ophelia to stay away from Hamlet.", the word "Laertes", which refers to a drama element, namely a character, will be annotated as "Character", a label taken from some set of metadata. Drama annotation projects, with the sets of metadata and annotations proposed in the scientific literature, rely upon markup languages and semantic encoding.

Recently, there have been many approaches to the annotation of stories (a larger set than drama, which includes general narrative not exclusively conveyed by characters performing actions). Annotations enrich drama documents with appropriate metadata. Most of the approaches, e.g., the Story Workbench tool (Finlayson, 2011) and the DramaBank project (Elson, 2012), build upon the linguistic expression of the story, typically some natural language, and annotate story elements, such as characters and conflicts, over the linguistic layer of part-of-speech tagging and verbal frames. Other approaches are more detached from the linguistic expression: they consider the cultural object of the story and rely on conceptual models encoded in logic frameworks, e.g., the Contextus Project¹ and the StorySpace ontology (Wolff et al., 2012).

However, most projects work in isolation: each approach provides its own annotation schema, without connection to general knowledge, and does not give the annotated documents a clear status. This paper presents an overview of the Drammar approach to the metadata annotation of dramatic texts. Gathering such a corpus is relevant for teaching drama through schematic charts (Lombardo et al., 2016b), informing models of automatic storytelling (Lombardo et al., 2015), and preserving drama as an intangible form of cultural heritage (Lombardo et al., 2016a). We briefly review the current approaches, before introducing the Drammar ontology underlying the annotation schema. Then we describe the crowdsourcing initiative POP-ODE and the current development of the annotated corpus. Finally, we briefly discuss the status of the annotated document, before the conclusion.

¹ http://www.contextus.net, visited on 7 July 2017.

2 Drama and annotation

A drama is a story conveyed through characters who perform live actions: for example, theatrical plays (Shakespeare's Hamlet), TV series (HBO's Sopranos²), but also reality shows (CBS's Survivor³) and games (Ubisoft's Assassin's Creed⁴). Metadata annotation for dramatic texts must encode the major concepts and relations of the drama domain, which have been shared by a majority of scholars in the drama literature. Here, we refer to the so-called dramatic qualities, that is, those elements that are necessary for the existence of a drama, which can be found in several drama analyses, e.g. (Lavandier, 1994; Ryngaert, 2008; Hatcher, 1996; Spencer, 2002). All the initiatives on this topic have shared similar sets of elements, namely story units, characters or agents, actions, intentions or plans, goals, conflicts, values at stake, and emotions. These elements are annotated in connection with media chunks (e.g., text paragraphs), often with the goal of constructing corpora of annotated narratives and of studying the relationships between the linguistic expression of the story in the narrative and its content.

Project DramaBank, which has proposed a template-based language for describing the narrative content of text documents, is a standalone downloadable application relying on an internal, non-standardized representation format (Elson, 2012). A media-independent model of story is provided by the OntoMedia ontology, exploited across different projects (such as the Contextus Project⁵) to annotate the narrative content of different media objects, ranging from written literature to comics and TV fiction. In the field of cultural heritage dissemination, the StorySpace ontology supports museum curators in linking the content of artworks through stories (Wolff et al., 2012), with the ultimate goal of enabling the generation of user-tailored content retrieval. Some initiatives also rely on automatic annotation approaches, which can overcome the difficulties of recruiting annotators, especially when minimal schemata targeted at grasping the regularities of written and oral narratives at the discourse level can be worked out (Rahimtoroghi et al., 2014).

Here, we provide an overview of the Drammar approach⁶, an ontology of drama specifically conceived to annotate dramatic media (Lombardo and Pizzo, 2014), which makes the knowledge about drama available as a vocabulary for the linked interchange of annotations and readily usable by automatic reasoners for implementing many tasks, such as the calculation of characters' emotions (Lombardo et al., 2015).

However, although a formal account is convenient because it is amenable to automatic reasoning, the use of ontology editors and reasoning tools is challenging for drama experts (Varela, 2016). For the accomplishment of the annotation task, it is crucial to provide a friendly environment with metaphors and interfaces that directly descend from drama scholarship and that shield the annotator from the details of the ontology representation. Here we describe a pipeline and system for the metadata annotation of dramatic texts.

² http://www.hbo.com/the-sopranos, visited on 21 July 2017.
³ http://www.cbs.com/shows/survivor/, visited on 21 July 2017.
⁴ https://www.ubisoft.com/en-US/game/assassins-creed/, visited on 21 July 2017.
⁵ http://www.contextus.net, visited on 21 July 2017.
⁶ https://www.di.unito.it/wikidrammar, visited on 15 October 2017.

3 The Drammar ontology

In order to build a formal encoding of the dramatic elements, Drammar resorts to a set of theories and models that are well established in Artificial Intelligence and Computer Science. Fig. 1 provides an overview of the major classes and properties of the ontology. On the left side is the timeline of incidents grouped into units (upper part, left), connected with the agents' intentions (or plans, lower part, left) through the concept of Action (middle part, left). On the right side is the hierarchical scene structure (upper part, right), connected to the patterns for describing actions (lower part, right), which assign roles to agents. The middle of the figure describes the agent, with its conflicts (lower part, middle) and mental states (middle). Elements in grey refer to external resources: List and TreeNode, on top, from abstract data structures; SituationSchema, FramenetSchema, and DescriptionTemplate, on the left, from linguistic resources; Agent and Object from a general upper ontology.

Figure 1: Major classes and properties of ontology Drammar.
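As a rough illustration of the layout in Fig. 1, the following is a minimal Python sketch of a few of the classes named in the figure (Action, Goal, Plan, AbstractPlan, DirectlyExecutablePlan, Agent, ConflictSet). It is not the Drammar OWL encoding: the Python types, the field names, and the helper `actions_of` are assumptions made only for illustration, and the example values paraphrase the Hamlet unit discussed in this paper (the wording of Polonius' goal and of Ophelia's subplan is hypothetical).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    description: str


@dataclass
class Goal:
    description: str


@dataclass
class Plan:
    name: str
    goal: Goal
    accomplished: Optional[bool] = None  # None: outcome not annotated yet


@dataclass
class DirectlyExecutablePlan(Plan):
    # Lowest level of the plan hierarchy: contains actions directly.
    actions: List[Action] = field(default_factory=list)


@dataclass
class AbstractPlan(Plan):
    # Higher-level behaviour, formulated as a list of subplans.
    subplans: List[Plan] = field(default_factory=list)


@dataclass
class Agent:
    name: str
    goals: List[Goal] = field(default_factory=list)
    plans: List[Plan] = field(default_factory=list)


@dataclass
class ConflictSet:
    # Aggregates the plans that are in conflict with one another.
    plans: List[Plan] = field(default_factory=list)


def actions_of(plan: Plan) -> List[Action]:
    """Flatten a plan hierarchy into the actions it ultimately contains."""
    if isinstance(plan, DirectlyExecutablePlan):
        return list(plan.actions)
    if isinstance(plan, AbstractPlan):
        return [a for sub in plan.subplans for a in actions_of(sub)]
    return []


meet_hamlet = Goal("meet Hamlet")
keep_away = Goal("keep Ophelia away from Hamlet")  # assumed wording

answer = DirectlyExecutablePlan(  # hypothetical subplan
    name="answer Polonius",
    goal=meet_hamlet,
    actions=[Action("Ophelia answers Polonius about Hamlet")],
)
ophelia_plan = AbstractPlan(
    name="convince Polonius that Hamlet is reliable",
    goal=meet_hamlet,
    accomplished=False,
    subplans=[answer],
)
polonius_plan = DirectlyExecutablePlan(
    name="convince Ophelia that she is too candid for Hamlet",
    goal=keep_away,
    accomplished=True,
    actions=[Action("Polonius asks Ophelia about her feelings")],
)

ophelia = Agent("Ophelia", goals=[meet_hamlet], plans=[ophelia_plan])
polonius = Agent("Polonius", goals=[keep_away], plans=[polonius_plan])
conflict = ConflictSet(plans=[ophelia_plan, polonius_plan])
```

Representing the conflict as a separate `ConflictSet` object, rather than as mutual references between plans, mirrors the ConflictSet class of the figure and keeps the sketch free of circular data.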
The Timeline is the element closest to the drama document (a literary text or an audiovisual medium): it is a succession of the incidents (or Actions) that happen in the drama. Incidents are assembled into discrete structures, called Units. Each succession of incidents forms a sub-timeline of the whole timeline of the drama. This level is formalized through the Situation Calculus paradigm (McCarthy, 1986): sub-timelines function as operators that advance the story world from one state to another, with states aggregated in ConsistentStateSets that work as the preconditions and effects of some sub-timeline of incidents.

The actions result from the deliberation process of the characters, named Agents, which centers upon the notion of the character's intention in achieving (or trying to achieve) a Goal. The intention, or the commitment of the character, is represented by a Plan, which consists of the actions that are to be carried out in order to achieve some goal; plans are organized hierarchically, with high-level behaviors (AbstractPlans) formulated as lists of lower-level plans, or subplans, down to the DirectlyExecutablePlans, which directly contain actions. Goals originate from the values of the characters that are put at stake and need to be restored (ValueEngaged), given the Beliefs (i.e. the knowledge) of the agents. This level is formalized through the rational agent paradigm, or BDI (Belief, Desire, Intention) paradigm (Bratman, 1987), which is also applied in the computational storytelling community (Norling and Sonenberg, 2004; Peinado et al., 2008). So, an agent is characterized by goals, beliefs, values engaged, emotions, and plans; values can be at stake (atStake true) or in balance (atStake false); plans can be in conflict with other plans, possibly of other agents; a conflict set aggregates all the plans, agents, and goals that determine a dramatic scene (DrammarScene), through the game of alternate accomplishments. A plan motivates the existence of a (sub)timeline, has preconditions and effects, which are consistent sets of states, and can be accomplished or not. Finally, scenes, defined by the author or perceived by the audience in order to segment the timeline appropriately, are recursively composed of daughter scenes. A scene spans a timeline, that is, a sequence of units. Some scenes are DrammarScenes, meaning that they are motivated by some conflict over the characters' intentions, which is the characterization of scenes according to the Drammar ontology.

The concepts and relations of the ontology Drammar are written in the Semantic Web language OWL (Web Ontology Language), in particular OWL 2 RL (Rule Language), a syntactic and semantic restriction of OWL 2. This makes it possible to address the problem of connecting drama knowledge with general knowledge. In fact, since Drammar includes classes that are intended as an interface between the drama domain concepts and the linguistic and common-sense types of knowledge (see the grey boxes in Fig. 1), it is compliant with the paradigm of linked data (Heath and Bizer, 2011).

4 Crowdsourcing annotation of drama texts: the POP-ODE initiative

POP-ODE consists of a pipeline and a number of tools for the accomplishment of the metadata annotation task for dramatic texts. A web-based interface supports the feeding of the tables of a database, built according to the tenets of ontology Drammar: story units, characters, actions, intentions or plans, goals, conflicts, values at stake (emotions are calculated automatically from these data). The ontology axioms have been encoded by the drama scholar (supported by the ontology engineers) through the well-known Protégé editor⁷. A module converts the database tables into an OWL file, actually a Drammar Instantiated Ontology file (OWL DIO file).

Figure 2 shows the web interface for the annotation. The top of the figure shows the text selector: on the left, the Hamlet text from an authoritative source (Shakespeare Navigators); on the right, the text chunk that pertains to the unit selected below. The middle of the figure shows the unit annotation, that is, the actions that the annotator has identified in the selected segment of the text, recognized as a bounded unit. On the left and the right of the unit annotation are the previous and the following unit in the story timeline, with the values that are at stake or in balance before and after the current unit. So, in this example, the unit concerns Polonius asking Ophelia about her feelings; it occurs after Polonius blesses Laertes on his departure and before Ophelia promises to avoid Hamlet. The bottom of the figure concerns the plans that motivate such a unit. In particular, going from left to right, we see that Ophelia (the agent or character shown at the left), who has the goal of meeting Hamlet, has the plan of convincing her father Polonius that Hamlet is reliable, and this plan is in conflict with the plan of Polonius, who wants to convince Ophelia that she is too candid for Hamlet. As we know from the following unit, Polonius will succeed in convincing Ophelia, and Ophelia's plan actually fails (see "accomplished? NO" at the far right).

The corpus of annotated drama documents currently consists of a small number of textual and video drama documents (see Table 1). Though we have not carried out a thorough evaluation of the annotation, we have employed the annotated documents in two applicative tasks: the first is the calculation of the emotions felt by the characters through automatic reasoning, on the basis of the events and the intentions manually annotated (Lombardo et al., 2015); the second is the realization of printed charts of the characters' intentions, aligned with the timeline of incidents (Lombardo et al., 2016b), currently employed in the didactics of drama writing at the University of Torino. We are going to evaluate the appropriateness of Drammar in terms of descriptive adequacy from the point of view of research in the humanities.

The current corpus has been employed in the realization of printed charts of the characters' intentions aligned with the timeline of incidents (Lombardo et al., 2016b); the application of automatic reasoning techniques to compute the emotions felt by the characters on the basis of the events and the intentions manually annotated (Lombardo et al., 2015); the proposal of a model for the preservation of drama as an intangible form of cultural heritage (Lombardo et al., 2016a); and the encoding of Stanislavsky's Action Analysis, useful in perspective for supporting actor rehearsals and drama staging (Albert et al., 2016).

Finally, we report a few considerations on the status of a Drammar instantiated file, which contains an annotated drama text, by connecting the Drammar format with the widespread FRBR conceptual model.

⁷ http://protege.stanford.edu, visited on 15 October 2017.

Figure 2: The web interface of the POP-ODE annotation: (top) text selection; (middle) unit annotation; (bottom) intentions-goals-conflicts annotation.
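The conversion step of the POP-ODE pipeline described in this section (database tables turned into an OWL file) can be pictured with a small sketch. The code below is not the actual POP-ODE module: the row layout, the function `tables_to_turtle`, and the namespace `http://example.org/drammar-sketch#` are hypothetical, and the Turtle serialization is an assumption, since the concrete format of the OWL DIO file is not specified here. Only the class and property names (Agent, Plan, isIntendedBy, inConflictWith, accomplished) are taken from Fig. 1.

```python
from typing import Dict, List

# Hypothetical namespace; the real Drammar IRIs are not given in the text.
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix ex: <http://example.org/drammar-sketch#> .\n"
)


def name(key: str) -> str:
    """Map a table key to a prefixed name (keys are assumed to be simple identifiers)."""
    return "ex:" + key


def bool_literal(value: bool) -> str:
    """Turtle shorthand for a boolean literal."""
    return "true" if value else "false"


def tables_to_turtle(agents: List[Dict], plans: List[Dict]) -> str:
    """Emit one individual per table row, with its properties, as Turtle text."""
    out = [PREFIXES]
    for row in agents:
        out.append(f"{name(row['id'])} a owl:NamedIndividual, ex:Agent .")
    for row in plans:
        out.append(f"{name(row['id'])} a owl:NamedIndividual, ex:Plan ;")
        out.append(f"    ex:isIntendedBy {name(row['agent'])} ;")
        for other in row.get("in_conflict_with", []):
            out.append(f"    ex:inConflictWith {name(other)} ;")
        out.append(f"    ex:accomplished {bool_literal(row['accomplished'])} .")
    return "\n".join(out) + "\n"


# Rows paraphrasing the Hamlet unit of Figure 2 (identifiers are made up).
agents = [{"id": "ophelia"}, {"id": "polonius"}]
plans = [
    {"id": "plan_ophelia", "agent": "ophelia",
     "in_conflict_with": ["plan_polonius"], "accomplished": False},
    {"id": "plan_polonius", "agent": "polonius",
     "in_conflict_with": ["plan_ophelia"], "accomplished": True},
]

turtle = tables_to_turtle(agents, plans)
print(turtle)
```

Writing the triples as plain strings keeps the sketch free of dependencies; a production converter would more likely rely on an RDF library and on the ontology's own IRIs.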
The FRBR model (Functional Requirements for Bibliographic Records) (O'Neill, 2002), designed for capturing the semantics of bibliographic information, addresses the abstract ideation (called Work, e.g., Beethoven's idea of the Ninth Symphony), the encoding in a specific language such as the text (called Expression, e.g., the Berliner Philharmoniker's interpretation of the Ninth), the concrete representation (called Manifestation, e.g., some recording of the Ninth by the Berliner Philharmoniker), and a single instance (called Item, e.g., some published CD of that recording). In our case, the instantiated OWL file is a particular Expression of the underlying drama abstraction (called Work, in FRBR terms), encoded in the ontological format. So, the original textual document is an actual Manifestation of the ontological linguistic Expression that is perfectly compliant with the FRBR model. We can have many manifestations of such a single expression, which however constrains units and timelines to remain unaltered.

Medium               Work                                        Fragment
Text                 Hamlet (Shakespeare)                        whole text (Arden book)
Text                 Mutter Courage und ihre Kinder (Brecht)     whole text (in Italian, Einaudi)
Text                 L'Arialda (Testori's Italian neorealism)    whole text (in Italian, Feltrinelli)
Movie                Apocalypse Now                              helicopter attack scene (Ride of the Valkyries)
Movie                Taxi Driver                                 "You talkin' to me?" scene
Movie                The Matrix                                  bullet time scene
Movie                La Dolce Vita                               Trevi fountain scene
Movie                A Clockwork Orange                          Flat Block Marina scene
Movie                Blade Runner                                "I've seen things ..." scene
Movie                The Deer Hunter                             Russian roulette scene
Movie                The Godfather                               Sollozzo homicide scene
Movie                Snatch                                      dog vs. rabbit scene
Movie                Kill Bill - Vol. 2                          "losing the other eye" scene
Musical video clip   Taylor Swift's "You Belong with Me"         3-min video
Advertisement clip   "Zippo" lighter commercial                  30-sec video
Animation short      Oktapodi                                    2:30-min video

Table 1: Corpus of annotated drama documents.

5 Conclusion

In this paper, we have described the Drammar approach for the metadata annotation of dramatic texts. We have described the Drammar ontology and the POP-ODE initiative for the annotation pipeline for drama documents, together with the web-based annotation tool. We are going to carry out an extensive test of the annotation tool over several student classes, together with questionnaires and ethnographic observations, to evaluate the functioning of the tool and to create a large corpus for studies in the digital humanities.

References

Giacomo Albert, Antonio Pizzo, Vincenzo Lombardo, Rossana Damiano, and Carmi Terzulli. 2016. Bringing authoritative models to computational drama (encoding Knebel's action analysis). In Interactive Storytelling. 9th International Conference on Interactive Digital Storytelling, ICIDS 2016, volume 10045, pages 285–297, Cham, November 15–18. Springer International Publishing.

Michael E. Bratman. 1987. Intention, Plans, and Practical Reason. Harvard University Press, Cambridge (MA).

David K. Elson. 2012. DramaBank: Annotating agency in narrative discourse. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey.

Mark Alan Finlayson. 2011. The Story Workbench: An extensible semi-automatic text annotation tool. In AAAI Publications, Workshops at the Seventh Artificial Intelligence and Interactive Digital Entertainment Conference.

Jeffrey Hatcher. 1996. The Art and Craft of Playwriting. Story Press, Cincinnati, Ohio.

T. Heath and C. Bizer. 2011. Linked data: Evolving the web into a global data space. Synthesis Lectures on the Semantic Web: Theory and Technology, pages 1–136.

Yves Lavandier. 1994. La dramaturgie. Le clown et l'enfant, Cergy.

Vincenzo Lombardo and Antonio Pizzo. 2014. Multimedia tool suite for the visualization of drama heritage metadata. Multimedia Tools and Applications, 75(7):3901–3932.

Vincenzo Lombardo, Cristina Battaglino, Antonio Pizzo, Rossana Damiano, and Antonio Lieto. 2015. Coupling conceptual modeling and rules for the annotation of dramatic media. Semantic Web Journal, 6(5):503–534.

Vincenzo Lombardo, Antonio Pizzo, and Rossana Damiano. 2016a. Safeguarding and accessing drama as intangible cultural heritage. ACM Journal on Computing and Cultural Heritage, 9(1):1–26.

Vincenzo Lombardo, Antonio Pizzo, Rossana Damiano, Carmi Terzulli, and Giacomo Albert. 2016b. Interactive chart of story characters' intentions. In Interactive Storytelling, 9th International Conference on Interactive Digital Storytelling, ICIDS 2016, Los Angeles, CA, USA, November 15–18, 2016, Proceedings, volume 10045, pages 415–418, Cham. Springer International Publishing.

John C. McCarthy. 1986. Mental situation calculus. In Proceedings of the 1986 Conference on Theoretical Aspects of Reasoning About Knowledge, TARK '86, pages 307–307, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

E. Norling and L. Sonenberg. 2004. Creating Interactive Characters with BDI Agents. In Proceedings of the Australian Workshop on Interactive Entertainment IE2004.

E. T. O'Neill. 2002. FRBR: Functional requirements for bibliographic records; application of the entity-relationship model to Humphry Clinker. Library Resources and Technical Services, 46:150–158.

F. Peinado, M. Cavazza, and D. Pizzi. 2008. Revisiting Character-based Affective Storytelling under a Narrative BDI Framework. In Proc. of ICIDS 2008, Erfurt, Germany.

Elahe Rahimtoroghi, Thomas Corcoran, Reid Swanson, Marilyn A. Walker, Kenji Sagae, and Andrew Gordon. 2014. Minimal narrative annotation schemes and their applications. In AAAI Publications, Seventh Intelligent Narrative Technologies Workshop.

Jean-Pierre Ryngaert. 2008. Introduction à l'analyse du théâtre. Collection Cursus. Série Littérature. Armand Colin.

Stuart Spencer. 2002. The Playwright's Guidebook: An Insightful Primer on the Art of Dramatic Writing. Faber & Faber.

Miguel Escobar Varela. 2016. Interoperable performance research: Promises and perils of the semantic web. The Drama Review, 60(3).

Annika Wolff, Paul Mulholland, and Trevor Collins. 2012. Storyspace: A story-driven approach for creating museum narratives. In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, pages 89–98.