<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Once More, With Feeling: Measuring Emotion of Acting Performances in Contemporary American Film</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Naitian</forename><surname>Zhou</surname></persName>
							<email>naitian@berkeley.edu</email>
							<affiliation key="aff0">
								<orgName type="department">School of Information</orgName>
								<orgName type="institution">UC Berkeley</orgName>
								<address>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">David</forename><surname>Bamman</surname></persName>
							<email>dbamman@berkeley.edu</email>
							<affiliation key="aff0">
								<orgName type="department">School of Information</orgName>
								<orgName type="institution">UC Berkeley</orgName>
								<address>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Once More, With Feeling: Measuring Emotion of Acting Performances in Contemporary American Film</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">0D5C63CCEBB5454A85899FEB4413C19E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:50+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>film</term>
					<term>performance</term>
					<term>computational film analysis</term>
					<term>speech emotion recognition</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Narrative film is a composition of writing, cinematography, editing, and performance. While much computational work has focused on the writing or visual style in film, in this paper we conduct a computational exploration of acting performance. Applying speech emotion recognition models and a variationist sociolinguistic analytical framework to a corpus of popular, contemporary American film, we find narrative structure, diachronic shifts, and genre- and dialogue-based constraints located in spoken performances.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Film is rich in its supply of semiotic resources, communicating meaning from the interaction of language (encoded in a script), visuals (choices of composition, blocking, cinematography), sound, and more. Much computational work has arisen to examine slices of this semiotic field, including measuring how gender stereotypes or plot arcs are reflected in dialogue <ref type="bibr" target="#b33">[34,</ref><ref type="bibr" target="#b28">29,</ref><ref type="bibr" target="#b13">14]</ref> or how visual features like color variance and shot length constitute genre <ref type="bibr" target="#b24">[25,</ref><ref type="bibr" target="#b10">11]</ref>. One critical area, however, that has been neglected in this line of study is the role of performance in creating meaning.</p><p>As Naremore <ref type="bibr" target="#b17">[18]</ref> notes, film is a medium in which meaning is acted out; an acting performance provides a semiotic frame through which we can understand the events that unfold. Given the fixed text of a script, the rendering of the final performance is an interpretive process in which the actor, director, and editor jointly imbue the words with additional meaning. In this view, the same line of dialogue exhibits variation in meaning when performed in distinct diegetic contexts. As one example, consider the following line in Knives Out (2019): "I'm warning you."</p><p>Much of the film revolves around these three words, overheard in a conversation between the wealthy Harlan Thrombey and his grandson, Ransom. The line is uttered by multiple characters as the film unfolds: angrily shouted by Ransom, somberly recalled by the eavesdropper, and gleefully recounted by inspector Benoit Blanc upon solving the crime.
Even a single line, located within a single diegetic event, has great capacity for meaning-making in performance.</p><p>When viewed in this light, we can apply the analytical framework from variationist sociolinguistics to better understand this space of performance. Given a fixed line of dialogue (equivalent to a linguistic variable), a performance entails a choice - a selection from the set of possible variants. It is this choice, and the meaning contained within, which we study.</p><p>In this work, we design computational models to explore this form of variation by considering the emotional range of performances in contemporary American film, exploring in particular the tension between what characters say and how they say it. As distinct from prior work in the computational humanities that has measured emotion from text alone <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>, we measure acted emotion from speech, allowing us to disentangle the emotion present in the script from the choices made in creating the performance.</p><p>Using a speech emotion recognition model, we construct a parallel dataset of spoken performances (utterances) aligned with the text of the words being spoken (dialogue phrases). This dataset allows us to isolate and examine how performances derive meaning from their paralinguistic features in addition to the textual meaning of the screenplay. We use this dataset to carry out several case studies exploring variation in performance in American film:</p><p>1. First, we carry out a structural analysis of emotion as performed over narrative time. Doing so allows us to characterize film as performance text, relating emotional performance to larger narrative structure. 2.
Second, we study diachronic variation by comparing the emotionality of films across release years, testing the degree to which performances have intensified over time (following Bordwell's theories of visual style <ref type="bibr" target="#b3">[4]</ref>). 3. Finally, we examine the capacity for performance by constructing a novel measure of emotional range for an utterance - the space of possible emotions that can be performed.</p><p>In doing so, we demonstrate how both contextual (genre) and textual (dialogue) aspects of film can carry constraints and affordances on acting performance.</p><p>In this work, we use computational methods to survey how both textual and contextual variables inform and reflect the performances rendered on screen.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methods</head><p>In order to perform our analysis, we need to construct an aligned dataset of actor performances (utterances), the text of the words they speak (phrases), and the emotions in each utterance. We create a pipeline that takes as input a set of full-length movies, and outputs time-aligned transcriptions for utterances, their emotion labels, and groups of semantically similar phrases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Preprocessing pipeline</head><p>We first construct a data pipeline to segment and transcribe utterances from movie dialogues. The pipeline takes as input a set of MP4 files, where each file is one digitized film. Our analysis takes place in the speech and text modalities, so we use ffmpeg to extract the audio track.</p><p>We use the pyannote<ref type="foot" target="#foot_0">1</ref> segmentation model to detect continuous, single-speaker speech segments. Then, we use faster-whisper<ref type="foot" target="#foot_1">2</ref> to transcribe each speech segment. Because pyannote speech segments are based on voice activity detection and silences, the model can label extended, multi-sentence speech as a single segment. For our analysis, however, we are interested in utterances as a discursive unit. If a character makes an assessment, then poses a question, we would like to split these into two distinct utterances. As a middle ground between raw voice activity detection and segmenting discursive units, which requires complex conversational understanding, we perform a post-processing step where we further split speech segments at sentence boundaries derived from the transcriptions. Because Whisper is an end-to-end model that does not produce fine-grained time alignments, we then use a speech-to-text fine-tuned wav2vec2 <ref type="foot" target="#foot_2">3</ref> to perform word-level time alignment between the transcription and the audio, then split the audio based on sentence boundaries generated by syntok,<ref type="foot" target="#foot_3">4</ref> a fast, rule-based sentence segmenter.</p><p>To prevent end credit sequences from interfering with the results, we detect when the end credits begin by performing optical character recognition (OCR) on the shots in a movie and identifying long continuous sequences of shots that contain large amounts of text. We trim the movie to the beginning of the end credits.</p></div>
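The splitting step above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: given word-level time alignments (as produced by a forced aligner such as wav2vec2) and sentence-final word indices (as a sentence segmenter like syntok would yield), cut one voice-activity segment into per-sentence utterances. All function and variable names here are our own.

```python
# Hypothetical sketch of the post-processing split: cut one speech
# segment into per-sentence utterances using word-level timings.
def split_segment(words, sentence_ends):
    """words: list of (token, start_sec, end_sec), in order.
    sentence_ends: indices of the last word of each sentence.
    Returns a list of (text, start_sec, end_sec) utterances."""
    utterances = []
    start_idx = 0
    for end_idx in sentence_ends:
        chunk = words[start_idx:end_idx + 1]
        text = " ".join(tok for tok, _, _ in chunk)
        # utterance spans from the first word's onset to the last word's offset
        utterances.append((text, chunk[0][1], chunk[-1][2]))
        start_idx = end_idx + 1
    return utterances

words = [("I", 0.0, 0.2), ("see", 0.2, 0.5), ("Why", 0.9, 1.1), ("me", 1.1, 1.4)]
print(split_segment(words, [1, 3]))  # two utterances: "I see", "Why me"
```

The audio itself would then be cut at the same timestamps (e.g. with ffmpeg), yielding one clip per utterance.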
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Speech emotion recognition</head><p>To perform speech emotion recognition, we use a wav2vec2 large model without any task-specific fine-tuning to extract audio features. Then, we train a classification head to perform seven-way emotion classification, based on the Ekman model <ref type="bibr" target="#b8">[9]</ref> of six basic emotions (anger, disgust, happiness, sadness, fear, and surprise) and a neutral label. To train these models, we use the MELD dataset, which contains 1,000 sampled dialogues from the TV series Friends <ref type="bibr" target="#b23">[24]</ref>.</p><p>We experiment with two classification settings: an utterance-level model, which makes predictions based on only the speech features of the input utterance, and a conversation-level model, which includes the speech features of both the input utterance and its surrounding utterances.</p><p>In both cases, we use a pretrained wav2vec2 model as a backbone for generating vector representations of each utterance. Because wav2vec2 creates an embedding for each audio frame (roughly 20ms of speech), we follow prior work in computing utterance embeddings by averaging across all timestamps within an utterance <ref type="bibr" target="#b20">[21]</ref>. We compute embeddings from the attention activations at each layer of the wav2vec2 model instead of just taking the last-layer activations; prior work has shown that, for paralinguistic tasks such as emotion recognition, early- and intermediate-layer activations are more useful than later layers <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b29">30]</ref>. At the end of the embedding step, each utterance is represented by a set of 25 vectors of 768 dimensions each.</p></div>
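The pooling step above amounts to a per-layer mean over frames. The toy sketch below (our own illustration, with tiny shapes in place of the paper's 25 layers of 768 dimensions) shows the operation on plain nested lists:

```python
# Illustrative mean pooling: collapse per-frame activations from each
# layer into one utterance vector per layer, by averaging over frames.
def pool_layers(activations):
    """activations: [n_layers][n_frames][dim] nested lists.
    Returns [n_layers][dim]: the per-layer mean over frames."""
    pooled = []
    for layer in activations:
        n_frames = len(layer)
        dim = len(layer[0])
        pooled.append([sum(frame[d] for frame in layer) / n_frames
                       for d in range(dim)])
    return pooled

toy = [[[1.0, 2.0], [3.0, 4.0]],   # layer 0: two frames, dim 2
       [[0.0, 0.0], [2.0, 2.0]]]   # layer 1
print(pool_layers(toy))  # [[2.0, 3.0], [1.0, 1.0]]
```

In practice this would run over tensors (e.g. a mean over the time axis of each layer's hidden states), but the reduction is the same.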
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">Utterance-level emotion recognition</head><p>We implement the utterance-level model from Pepino, Riera, and Ferrer <ref type="bibr" target="#b20">[21]</ref> and match the reported performance. We first take a weighted average of layer activations for a given utterance; these weights are learned during training. Then, we apply a fully-connected classification head to produce a probability distribution over the seven emotion labels. Unlike the original paper, we do not use features from the initial convolutional layer of the pre-trained model; we use only the attention head activations.</p></div>
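The learned layer-weighting can be sketched as a softmax over scalar weights that mixes the per-layer utterance vectors into a single feature vector for the classification head. In the model these weights are trainable parameters; here they are fixed toy values, and all names are ours:

```python
import math

# Sketch of the layer-weighting step: softmax-normalized scalar weights
# (learned during training in the real model) combine per-layer vectors.
def weighted_layer_average(layer_vectors, raw_weights):
    exps = [math.exp(w) for w in raw_weights]
    total = sum(exps)
    weights = [e / total for e in exps]          # softmax over layers
    dim = len(layer_vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, layer_vectors))
            for d in range(dim)]

vecs = [[1.0, 0.0], [0.0, 1.0]]
print(weighted_layer_average(vecs, [0.0, 0.0]))  # equal weights -> [0.5, 0.5]
```

The resulting vector would then feed the fully-connected head that outputs the seven-way emotion distribution.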
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">Contextual emotion recognition</head><p>We also train a contextual model which uses a bidirectional LSTM to predict the emotion of utterances within the context of a conversation. To do so, we define conversations as groups of utterances where each occurs within 3 seconds of the next. In the MELD dataset, there are 1,478 conversations in the training split according to this criterion.</p><p>For each conversation, we predict the emotion labels of all utterances in the conversation by first passing weighted activations through the biLSTM before applying the classification head to each hidden state. As before, the weights of activations are learned during training.</p></div>
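The 3-second conversation heuristic, as we read it, groups consecutive utterances whose gap (next start minus previous end) is at most 3 seconds. A minimal sketch, with hypothetical (start, end) tuples:

```python
# Group utterances into conversations: a new conversation starts whenever
# the silence between consecutive utterances exceeds max_gap seconds.
def group_conversations(utterances, max_gap=3.0):
    """utterances: time-ordered list of (start_sec, end_sec) tuples."""
    conversations = []
    current = [utterances[0]]
    for prev, cur in zip(utterances, utterances[1:]):
        if cur[0] - prev[1] <= max_gap:
            current.append(cur)          # same conversation
        else:
            conversations.append(current)
            current = [cur]              # gap too long: start a new one
    conversations.append(current)
    return conversations

utts = [(0.0, 1.0), (2.5, 4.0), (10.0, 11.0)]
print(group_conversations(utts))  # two conversations
```

The same grouping is what the evaluation section reuses to assemble clips with at least two utterances.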
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.3.">Evaluation</head><p>We expect the movie data to be similar in nature to the MELD dataset, since both consist of professionally produced and acted clips. However, to ensure that our models are robust to domain shift despite the greater range in release year and setting of the film corpus, we evaluate these models on the test split of the MELD dataset as well as on a manually collected dataset consisting of 333 clips from a subset of 35 contemporary American films. Each clip was a conversation with at least 2 utterances, where conversations were identified with the same heuristic used to construct training data for the contextual model. This resulted in a final evaluation dataset of 2,157 utterances with emotion labels. The clips were labeled by two annotators: 51 clips were labeled by both annotators and 250 clips were labeled by a single annotator. Krippendorff's 𝛼 between the two annotators was 0.334, and Fleiss' 𝜅 was 0.333, which matches the agreement of the MELD dataset.</p><p>Table <ref type="table">1</ref> shows the evaluation results on the MELD and Movies datasets. The models perform comparably to each other, and comparably across evaluation datasets. This performance also approaches the state of the art on MELD for audio-only models. Because the performance of the contextual model is slightly higher, we use its inference outputs for our analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Model comparison for MELD test dataset and our movies dataset, along with 95% bootstrap confidence interval bounds. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Identifying dialogue phrase groups</head><p>One powerful aspect of this dataset is that we align actor performances to the words that they speak. To account for variation in how highly semantically similar phrases can be realized, we cluster together phrases with high semantic similarity. We use the sentence-transformers library to compute sentence embeddings of utterances and cluster them with the Leiden community detection algorithm <ref type="bibr" target="#b30">[31]</ref>. Table <ref type="table" target="#tab_1">2</ref> shows some examples of phrases that are grouped together. We expect the phrases in each group to have similar prior distributions of emotion. </p></div>
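The grouping idea can be illustrated with a much-simplified stand-in: build a similarity graph over sentence embeddings and merge phrases whose cosine similarity exceeds a threshold. The paper uses sentence-transformers embeddings and Leiden community detection; the connected-components sketch below (with hypothetical 2-d vectors) only conveys the shape of the computation, not the actual algorithm.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Simplified stand-in for Leiden clustering: connected components of the
# graph whose edges link embeddings with cosine similarity >= threshold.
def group_phrases(embeddings, threshold=0.9):
    n = len(embeddings)
    parent = list(range(n))          # union-find forest
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

embs = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0)]
print(group_phrases(embs))  # [[0, 1], [2]]
```

Community detection differs from plain connected components in that it can split loosely bridged clusters, but the input (a phrase similarity graph) and output (phrase groups) are the same.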
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Analysis</head><p>The above pipeline measures the emotions performed in an utterance and ties each utterance to the text being spoken. We apply this methodology to a large corpus of contemporary, popular American films <ref type="bibr" target="#b0">[1]</ref> in order to study the variation of emotion within and between them.</p><p>Our corpus consists of the top-50 live-action, narrative films by U.S. box office from 1980-2022. We supplement these with films nominated for "Best Picture"-equivalent awards by one of six organizations in those years: Academy Awards, Golden Globes, British Academy of Film and Television Arts, Los Angeles Film Critics Association, National Board of Review, and National Society of Film Critics. We only include English-language films in this analysis, resulting in a total of 2,283 films.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Film as performance text</head><p>Plantinga <ref type="bibr" target="#b21">[22]</ref> describes how emotionality can reflect narrative structure: emotionally charged events can serve as a catalyst to disrupt the expository "stable state" and set the narrative in motion. Much attention has been paid to characterizing narratives in literature and film in terms of emotionality using trajectories of sentiment <ref type="bibr" target="#b12">[13]</ref> and emotion <ref type="bibr" target="#b25">[26,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b31">32]</ref>; these have focused on the emotion encoded in text. Audiences of movies, however, are not directly exposed to that text; their experience is mediated by the performance. In order to study this more directly, we turn our attention to characterizing narratives with emotion as performed.</p><p>We study the distribution of emotions in utterances over the course of a movie. How does the prevalence of each emotion shift over narrative time? Similar to previous work on dialogue in screenplays, we ask if there are emotional regularities across films <ref type="bibr" target="#b11">[12]</ref>. We first examine the emotionality of utterances - the average probability that an utterance is not neutral - before looking more closely at how specific emotions are distributed temporally. We plot the average probability of an emotion label for an utterance in intervals of 5 percent, expressed as a percentage of the full run-time of the film. Specific emotions are measured as proportions of the emotional labels, excluding the neutral label.</p><p>We find that the emotional trajectories of performances are, in fact, structured over narrative time. Figure <ref type="figure" target="#fig_0">1a</ref> shows that emotionality increases over narrative time. We also examine the trajectory of specific emotions across films (figs.
1b,1c,1d). We find that joyful performances follow a U-shaped curve, with a steep increase towards the end, as movies resolve. Like Hipson and Mohammad <ref type="bibr" target="#b11">[12]</ref>, we find that negative-valence emotions like sadness and anger decrease at the end. Further, anger peaks 85% into the film, reminiscent of a climax-resolution structure.</p></div>
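The narrative-time binning described above reduces to bucketing utterances by their onset as a fraction of runtime and averaging P(not neutral) per bucket. A minimal sketch, with hypothetical (onset_fraction, p_not_neutral) pairs:

```python
# Bucket utterances into 5% intervals of runtime (20 bins) and average
# the probability that each utterance is emotional (i.e., not neutral).
def emotionality_curve(utterances, n_bins=20):
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for frac, p_emotional in utterances:
        b = min(int(frac * n_bins), n_bins - 1)  # clamp frac == 1.0 into last bin
        sums[b] += p_emotional
        counts[b] += 1
    # empty bins yield None rather than a misleading zero
    return [s / c if c else None for s, c in zip(sums, counts)]

utts = [(0.01, 0.2), (0.04, 0.4), (0.99, 0.9)]
curve = emotionality_curve(utts)
print(curve[0], curve[19])
```

Per-emotion trajectories would use the same binning, with each bin averaging that emotion's share of the non-neutral probability mass.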
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Evolving emotionality</head><p>Subscribing to a particular categorization of emotions can be restrictive; in the remainder of the paper, we explore emotional performance, but depart from analyzing specific emotional labels. First, we study performance at a less granular level, focusing on the concept of emotionality as the proportion of utterances with any emotion label. <ref type="foot" target="#foot_4">5</ref> We measure how emotionality has changed historically over the decades spanned by our corpus. Emotional shifts have been identified in English fiction books: Morin and Acerbi <ref type="bibr" target="#b16">[17]</ref> find that the content of those stories has experienced a decline in emotional expression. Within cinema, David Bordwell has written about how shorter shot lengths and tighter framing serve to intensify the visual style in more recent films compared to earlier ones <ref type="bibr" target="#b3">[4]</ref>. We ask whether there is a similar shift in performance: is there an intensification of emotion that matches the visual intensification of film, or perhaps an emotional cooling in performance that matches the findings in English fiction? When we split the data by release year, we find a mild effect: earlier films have a higher proportion of emotional utterances compared to later ones, with emotionality hitting a minimum around 2010 (see Fig. <ref type="figure" target="#fig_1">2</ref>). However, the question remains whether the emotional content is changing (as Morin and Acerbi find in literature) or if the style with which words are being uttered is changing.</p><p>To disentangle the effects of shifting content and shifting style, we consider the change in emotionality over the years within the semantically equivalent phrase groups. If it is indeed the writing, and not the performance, that drives this shift in emotionality, we should see little change within a phrase group.
However, when we look at the 511 phrases that are used in all 43 years of the dataset, we find that a fixed-effects regression shows a slightly negative, statistically significant correlation between year and emotionality even within phrase groups (𝑅² = 0.048, 𝐹(1, 21461), 𝑝 &lt; 0.001).</p><p>Though this result is seemingly at odds with Bordwell's finding that visual style intensifies, it is also possible that they are harmonious. In Hollywood film, the close-up shot has always been associated with emotional expression <ref type="bibr" target="#b19">[20]</ref>. Panofsky <ref type="bibr" target="#b18">[19]</ref> writes that close-ups provide a rich "field of action" that affords nuanced acting performances. These visual performances, which are almost imperceptible if viewed from a natural distance, provide an alternative to the spoken word as a channel of expression. Comparing film to the stage, Panofsky writes that the spoken word makes a stronger impression "if we are not permitted to count the hairs in Romeo's mustache." As cinema further grows into its medium, Bordwell finds that close-up shots have indeed grown tighter on the subject. With an increase in the capacity for more nuanced performance in the visual channel, the emotionality of the spoken word need not bear so strong a burden.</p></div>
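The fixed-effects ("within") estimator used for the phrase-group regression can be sketched by demeaning year and emotionality within each group, then fitting one pooled OLS slope on the residuals. The data below are toy values, not from the corpus:

```python
# Within-group (fixed-effects) slope: demean x and y inside each phrase
# group, pool the residuals, and fit a single least-squares slope.
def within_slope(rows):
    """rows: list of (group_id, year, emotionality)."""
    by_group = {}
    for g, x, y in rows:
        by_group.setdefault(g, []).append((x, y))
    xs, ys = [], []
    for pts in by_group.values():
        mx = sum(x for x, _ in pts) / len(pts)
        my = sum(y for _, y in pts) / len(pts)
        for x, y in pts:
            xs.append(x - mx)
            ys.append(y - my)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx

rows = [("a", 1980, 0.5), ("a", 2020, 0.4),
        ("b", 1980, 0.9), ("b", 2020, 0.8)]
print(within_slope(rows))  # negative: emotionality declines within groups
```

Demeaning absorbs each phrase group's baseline emotionality, so the slope reflects only within-group change over time, mirroring the paper's control for shifting content.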
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Measuring emotional range</head><p>Range is often said to be the mark of a great actor. Naremore writes about the importance of an actor splitting their character "visibly into different aspects", showing off emotional range <ref type="bibr" target="#b17">[18]</ref>. Kuleshov similarly stressed that actors must be able to create a full range of gestures to create complex meaning <ref type="bibr" target="#b15">[16]</ref>. Wilson <ref type="bibr" target="#b32">[33]</ref> takes this a step further and argues that the hallmark of great acting is projecting a character into complex situations. In this section, we explore the limits of range through the constraints that genre and script impose on emotional performance.</p><p>For this analysis, we construct a general measure of emotional range across a set of utterances 𝑢1…𝑛. We characterize each utterance 𝑢𝑖 with a performance vector 𝑣⃗𝑖, which is a distribution over emotions, given by the predicted probability distribution from the speech emotion recognition model. This allows us to take a more nuanced view of performances as a mixture of emotions. We model the distribution from which the vectors 𝑣⃗1…𝑛 are drawn as a Dirichlet, and find the parameters which maximize the likelihood of the observed vectors. We define emotional range as the entropy of this distribution: a higher entropy means there is greater variance in the distribution of performances, and a lower entropy signals lower emotional range.</p><p>One criticism of the Ekman emotional model lies in its construct validity: seven discrete emotion labels may be insufficient to characterize the space of emotions. Ideally, we would model a continuous space of "performance". In our previous analyses, we use these emotion labels as an intermediate between that ideal on one end, and sentiment analysis on the other.
Here, our measure of emotional range is agnostic to the meaning of specific emotion labels, and serves to demonstrate how emotion classification can be a useful proxy task through which we can analyze performance in a more continuous space.</p><p>Thrillers have the least range; family-friendly films have the most. <ref type="bibr">Wilson [33]</ref> speculates that some genres, like some types of comedy, have less capacity for emotional range than others. Previous work has shown that emotional arcs are correlated with genre <ref type="bibr" target="#b27">[28]</ref>. We ask whether different genres are associated with different capacities for emotional performance.</p><p>We calculate the emotional range for each movie, and find the average score for each genre. Genre information comes from IMDb, and we exclude genres with fewer than 30 films in our dataset. <ref type="foot" target="#foot_5">6</ref> Figure <ref type="figure" target="#fig_2">3</ref> shows the average entropy across genres. Thrillers, biographies, and mysteries have the least emotional range; fantasy, musicals, and family films rank highest. While it is difficult to attribute these results to a particular property of specific genres, these findings show that some genres have more constrained or consistent emotional registers than others.</p><p>Functional phrases have less capacity for emotional range. Naremore <ref type="bibr" target="#b17">[18]</ref> references Goffman when theorizing about performance: actors draw on and play against the interactional norms with which we as audience are already familiar. We ask if this bears out in our data. Does the emotional range of dialogue phrases reflect their discursive properties?</p><p>To study this, we measure the emotional range in dialogue. Because we tie specific performances to the words that are spoken, we can identify instances across the corpus when a given phrase was uttered.
We isolate the 2,656 phrase groups that are uttered at least 50 times across our dataset. For each phrase group, we calculate the emotional range of its utterances.</p><p>Table <ref type="table" target="#tab_4">3</ref> shows phrases with the highest and lowest entropies. By inspecting the phrases at either end of the spectrum, we find qualitative differences in the kinds of phrases that have higher and lower emotional range: the capacity for emotional variance reflects the discursive flexibility of the words being spoken. Phrases with low range are functional and generally part of highly directed interactions: most phrases are either yes-or-no questions or answers to them. Phrases with high emotional range, on the other hand, mostly have more open-ended, evaluative discursive functions. In these cases, the prosody or intonation of speech can easily lend color to the statement being made. "You're alive" can be said with joy or relief to a loved one, as Marty McFly to his mentor Doc in Back to the Future (1985), or with anger at the sight of an enemy, as Lord Norinaga greets Walker in Teenage Mutant Ninja Turtles III (1993). </p></div>
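The range measure is the differential entropy of the fitted Dirichlet, H(α) = log B(α) + (α₀ − K)ψ(α₀) − Σⱼ(αⱼ − 1)ψ(αⱼ). The sketch below computes this entropy for given parameters (the MLE fitting step is omitted); the digamma approximation is a standard recurrence plus asymptotic series, and all names are ours, not the paper's code. Note the entropy is negative for concentrated distributions, consistent with the values in Table 3.

```python
import math

def digamma(x):
    """Approximate psi(x) for x > 0 via recurrence + asymptotic series."""
    result = 0.0
    while x < 6.0:               # recurrence: psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    result += math.log(x) - 0.5 * inv
    inv2 = inv * inv
    # asymptotic expansion: -1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result -= inv2 * (1 / 12.0 - inv2 * (1 / 120.0 - inv2 / 252.0))
    return result

def dirichlet_entropy(alphas):
    """Differential entropy of Dirichlet(alphas): the emotional-range score."""
    a0 = sum(alphas)
    k = len(alphas)
    log_beta = sum(math.lgamma(a) for a in alphas) - math.lgamma(a0)
    return (log_beta
            + (a0 - k) * digamma(a0)
            - sum((a - 1) * digamma(a) for a in alphas))

# Uniform Dirichlet over the 7-simplex: entropy = -log Gamma(7) = -log 720
print(dirichlet_entropy([1.0] * 7))
```

Larger, more balanced α (broader spread of observed performance vectors) yields higher entropy, i.e., greater emotional range; tightly concentrated α yields the very negative entropies seen for functional phrases.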
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Discussion and limitations</head><p>With this work, we demonstrate that films can, and should, be studied as performance texts. We tie our findings to both film theory and other computational work on narratives. Here, we discuss some limitations of the current study.</p><p>Measuring emotions. We follow a vast body of previous work within natural language processing <ref type="bibr" target="#b35">[36,</ref><ref type="bibr" target="#b34">35]</ref>, affective computing <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref> and computational literary studies <ref type="bibr" target="#b27">[28,</ref><ref type="bibr" target="#b2">3]</ref> in using Ekman's basic emotions. However, the validity of this model has been questioned <ref type="bibr" target="#b22">[23]</ref>. First, there are doubts about the ecological validity of emotion recognition, especially as most speech emotion recognition datasets contain acted emotion as opposed to natural emotion. We note that, unlike much affective computing work, we use emotion recognition models trained on acted speech to perform inference on acted speech. The professionally produced, acted speech in the MELD dataset is well-suited to our data, which is also professionally produced and acted. Indeed, we find that model performance is similar between MELD and our in-domain evaluation data.</p><p>Another criticism lies in the cultural relativity of emotion. Though Ekman argues that the basic emotions are universal, he acknowledges there may be cultural differences in the emotions elicited in a given context. It is reasonable to suppose that viewers' normative knowledge also influences the interpretation of these emotions. We focus on contemporary American film in both our analysis and training data, holding at least the intended cultural audience constant.
Cultural variation in performance is a ripe area for future work, as cultural differences exist in not only the production and interpretation of emotion, but also in theories of acting.</p><p>Aside from these specific criticisms of the Ekman model, the low interannotator agreement in both our evaluation set as well as other datasets, including MELD, suggests that this model for emotion may remain too coarse to precisely describe the data. Work in both affective psychology and NLP has attempted to address this by using more fine-grained classes <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b6">7]</ref> or continuous spaces of emotion <ref type="bibr" target="#b26">[27,</ref><ref type="bibr" target="#b6">7]</ref>. While we used the Ekman model due to the availability of training data as well as to provide comparison with previous studies of emotion in narratives, alternative emotion models may prove useful in future work.</p><p>A question of authorship. In cinema, the performance that audiences see on screen is co-created by the actor, the director, and the editor. Baron and Carnicke <ref type="bibr" target="#b1">[2]</ref> describe the conventional wisdom within film analysis to be that cinematic performances are made in the cutting room. "True" acting happens on the stage. Though our work studies film as performance text, it does not disentangle the processes through which the performance is constructed. It is about the performance as viewed, but not about the choices made by actors as separate from the director or editor. Our work makes the point that performance carries meaning worth studying, and opens the door for future computational work that explores its authorial roots.</p><p>Embodied performance. Finally, we examine only performance as enacted through speech.
This is perhaps the modality that lies closest to the script, and allows us to apply a variationist approach to studying the relationship between performance and text, but of course performance includes not just speech but also gesture, posture, facial expression, and more. Visual description has been found to be more useful for aligning narrative events than dialogue <ref type="bibr" target="#b36">[37]</ref>, and quantitative analysis of theater performance has found narratively meaningful patterns in movement <ref type="bibr" target="#b9">[10]</ref>. Film is a multimodal medium that deserves analysis in all its modalities. We hope that our work examining film across speech and text can serve as a basis for more work that examines performance as embodied visually.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this paper, we explore the relation between film as narrative text and as performance text. Using a novel parallel dataset of speech and text from popular contemporary American film, we develop computational methods to measure how emotional prevalence and emotional range vary by both textual factors of narrative time and dialogue, as well as contextual factors of release year and genre. We hope this work inspires further multimodal studies of performance in computational film analysis.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Emotionality increases, but specific emotions show non-linear trajectories over narrative time (95% bootstrap confidence interval).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Emotionality is higher in older films (95% bootstrap confidence interval).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Relative emotional range for different film genres (95% bootstrap confidence intervals).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Examples of utterances which are clustered into dialogue phrase groups.</figDesc><table><row><cell>Phrase Groups</cell></row><row><cell>"Let's go, let's go, let's go!", "Let's go, let's go!", "Let's go right now go go", "Go, let's go, let's go. ", "Okay guys, let's go. "</cell></row><row><cell>"Oh, pleasure to meet you. ", "It's so nice to finally meet you. ", "It is a pleasure to finally meet you. ", "Oh, it's nice to meet you. ", "It's so nice to meet you!"</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head></head><label></label><figDesc>[Axis residue of Figure 3: emotional range (approx. 9.8 to 11.2) by genre, ordered from highest to lowest: Thriller, Biography, Mystery, Crime, Sci-Fi, Action, Romance, History, Drama, Music, Family, Musical, Fantasy, Horror, Adventure, War, Sport, Comedy.]</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 3</head><label>3</label><figDesc>Dialogue phrase groups with the highest and lowest emotional range scores. The table shows a representative phrase from each group.</figDesc><table><row><cell>Low Emotional Range</cell><cell></cell><cell>High Emotional Range</cell><cell></cell></row><row><cell>Phrase</cell><cell cols="2">Entropy Phrase</cell><cell>Entropy</cell></row><row><cell>"Could I ask you something?"</cell><cell>-17.02</cell><cell>"All rise. "</cell><cell>-7.85</cell></row><row><cell>"This is your captain speaking. "</cell><cell>-16.56</cell><cell>"Are you out of your mind?"</cell><cell>-7.88</cell></row><row><cell>"Is that okay?"</cell><cell>-16.53</cell><cell>"What the fuck wrong with you?"</cell><cell>-7.99</cell></row><row><cell>"Can I get something for you?"</cell><cell>-16.17</cell><cell>"You're alive. "</cell><cell>-8.32</cell></row><row><cell>"Can I get something to drink?"</cell><cell>-16.16</cell><cell>"You saved my life. "</cell><cell>-8.34</cell></row><row><cell>"Hey, what can I get you?"</cell><cell>-15.91</cell><cell>"Don't you understand?"</cell><cell>-8.34</cell></row><row><cell>"You wanna come?"</cell><cell>-15.65</cell><cell>"Don't be so afraid. "</cell><cell>-8.39</cell></row><row><cell>"Yeah, that's good. "</cell><cell>-15.36</cell><cell>"You son of a bitch. "</cell><cell>-8.40</cell></row><row><cell>"Any questions?"</cell><cell>-15.26</cell><cell>"You scared the shit out of me!"</cell><cell>-8.44</cell></row><row><cell>"That's correct. "</cell><cell>-15.25</cell><cell>"Ow. "</cell><cell>-8.44</cell></row></table></figure>
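Table 3 scores each dialogue phrase group by an entropy-based measure of emotional range. As an illustration only, here is a minimal sketch that scores a group by the discrete Shannon entropy of the emotion labels predicted for its performances; the function `emotional_range` and the toy label lists are hypothetical, and the paper's own estimator evidently differs (its reported values are negative, suggesting something like a differential entropy over a continuous emotion representation):

```python
import math
from collections import Counter

def emotional_range(labels):
    """Shannon entropy (in nats) of the emotion-label distribution across
    performances of one dialogue phrase group. Higher entropy means the same
    line is delivered with a wider variety of emotions. Illustrative only;
    not the paper's estimator."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# A line like "All rise." is performed many different ways -> high range.
wide = ["anger", "joy", "fear", "surprise", "neutral", "sadness"]
# A line like "That's correct." is almost always neutral -> low range.
narrow = ["neutral"] * 5 + ["joy"]

assert emotional_range(wide) > emotional_range(narrow)
```

Under this toy measure, six distinct labels in six performances give the maximum entropy log(6), while a nearly uniform "neutral" group scores close to zero.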
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://huggingface.co/pyannote/speaker-segmentation</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/SYSTRAN/faster-whisper</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://huggingface.co/facebook/wav2vec2-base-960h</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://github.com/fnl/syntok</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">The model achieves an F1 score of 0.69 on the neutral label.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">Three genres were excluded: Western (19 films), Documentary (3), and Animation (2).</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The research reported in this article was supported by funding from Mellon Foundation and the National Science Foundation (IIS-1942591 and DGE-2146752). We thank Jacob Lusk and Lucy Li for insightful discussion and feedback.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Measuring Diversity in Hollywood through the Large-Scale Computational Analysis of Film</title>
		<author>
			<persName><forename type="first">D</forename><surname>Bamman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Samberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>So</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the National Academy of Sciences</title>
				<meeting>the National Academy of Sciences</meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Reframing Screen Performance</title>
		<author>
			<persName><forename type="first">C</forename><surname>Baron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Carnicke</surname></persName>
		</author>
		<idno type="DOI">10.3998/mpub.104480</idno>
		<imprint>
			<date type="published" when="2008">2008</date>
			<publisher>University of Michigan Press</publisher>
			<pubPlace>Ann Arbor, MI</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Automatic Classification of Literature Pieces by Emotion Detection: A Study on Quevedo&apos;s Poetry</title>
		<author>
			<persName><forename type="first">L</forename><surname>Barros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ortigosa</surname></persName>
		</author>
		<idno type="DOI">10.1109/acii.2013.30</idno>
	</analytic>
	<monogr>
		<title level="m">Humaine Association Conference on Affective Computing and Intelligent Interaction</title>
				<imprint>
			<date type="published" when="2013">2013. 2013</date>
			<biblScope unit="page" from="141" to="146" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Intensified Continuity Visual Style in Contemporary American Film</title>
		<author>
			<persName><forename type="first">D</forename><surname>Bordwell</surname></persName>
		</author>
		<idno type="DOI">10.1525/fq.2002.55.3.16</idno>
	</analytic>
	<monogr>
		<title level="j">Film Quarterly</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="16" to="28" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">IEMOCAP: Interactive Emotional Dyadic Motion Capture Database</title>
		<author>
			<persName><forename type="first">C</forename><surname>Busso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bulut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-C</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kazemzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Mower</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Narayanan</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10579-008-9076-6</idno>
	</analytic>
	<monogr>
		<title level="j">Language Resources and Evaluation</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="335" to="359" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset</title>
		<author>
			<persName><forename type="first">H</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">G</forename><surname>Cooper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Keutmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">C</forename><surname>Gur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nenkova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verma</surname></persName>
		</author>
		<idno type="DOI">10.1109/taffc.2014.2336244</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE transactions on affective computing</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="377" to="390" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Mapping the Passions: Toward a High-Dimensional Taxonomy of Emotional Experience and Expression</title>
		<author>
			<persName><forename type="first">A</forename><surname>Cowen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sauter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Tracy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Keltner</surname></persName>
		</author>
		<idno type="DOI">10.1177/1529100619850176</idno>
	</analytic>
	<monogr>
		<title level="j">Psychological Science in the Public Interest</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="69" to="90" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">GoEmotions: A Dataset of Fine-Grained Emotions</title>
		<author>
			<persName><forename type="first">D</forename><surname>Demszky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Movshovitz-Attias</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cowen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nemade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ravi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2005.00547</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An Argument for Basic Emotions</title>
		<author>
			<persName><forename type="first">P</forename><surname>Ekman</surname></persName>
		</author>
		<idno type="DOI">10.1080/02699939208411068</idno>
	</analytic>
	<monogr>
		<title level="j">Cognition and Emotion</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">3-4</biblScope>
			<biblScope unit="page" from="169" to="200" />
			<date type="published" when="1992">1992</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A Quantitative Close Analysis of a Theatre Video Recording</title>
		<author>
			<persName><forename type="first">M</forename><surname>Escobar Varela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">O F</forename><surname>Parikesit</surname></persName>
		</author>
		<idno type="DOI">10.1093/llc/fqv069</idno>
	</analytic>
	<monogr>
		<title level="j">Digital Scholarship in the Humanities</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="276" to="283" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Computationally Deconstructing Movie Narratives: An Informatics Approach</title>
		<author>
			<persName><forename type="first">T</forename><surname>Guha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Narayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Smith</surname></persName>
		</author>
		<idno type="DOI">10.1109/icassp.2015.7178374</idno>
	</analytic>
	<monogr>
		<title level="m">2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</title>
				<imprint>
			<biblScope unit="volume">2015</biblScope>
			<biblScope unit="page" from="2264" to="2268" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Emotion Dynamics in Movie Dialogues</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">E</forename><surname>Hipson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Mohammad</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0256153</idno>
	</analytic>
	<monogr>
		<title level="j">Plos One</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page">e0256153</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Syuzhet: Extract Sentiment and Plot Arcs from Text</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Jockers</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Movies Emotional Analysis Using Textual Contents</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Kayhani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Meziane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Chiky</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-51310-8\_19</idno>
	</analytic>
	<monogr>
		<title level="m">Natural Language Processing and Information Systems</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Métais</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Meziane</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Horacek</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">12089</biblScope>
			<biblScope unit="page" from="205" to="212" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Investigating the Relationship between Literary Genres and Emotional Plot Development</title>
		<author>
			<persName><forename type="first">E</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Padó</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Klinger</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/W17-2203</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature</title>
				<meeting>the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature<address><addrLine>Vancouver, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="17" to="26" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Kuleshov on Film: Writings</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">V</forename><surname>Kuleshov</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1974">1974</date>
			<publisher>University of California Press</publisher>
			<pubPlace>Berkeley</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Birth of the Cool: A Two-Centuries Decline in Emotional Expression in Anglophone Fiction</title>
		<author>
			<persName><forename type="first">O</forename><surname>Morin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Acerbi</surname></persName>
		</author>
		<idno type="DOI">10.1080/02699931.2016.1260528</idno>
	</analytic>
	<monogr>
		<title level="j">Cognition &amp; Emotion</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1663" to="1675" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Acting in the Cinema</title>
		<author>
			<persName><forename type="first">J</forename><surname>Naremore</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1988">1988</date>
			<publisher>University of California Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Style and Medium in the Moving Pictures</title>
		<author>
			<persName><forename type="first">E</forename><surname>Panofsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Film, an Anthology / Compiled and</title>
				<editor>
			<persName><forename type="first">Daniel</forename><surname>Talbot</surname></persName>
		</editor>
		<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Simon and Schuster</publisher>
			<date type="published" when="1959">1959</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Cutting across the Century: An Investigation of the Close up and the Long-Shot in &quot;Cine Choreography&quot; since the Invention of the Camera</title>
		<author>
			<persName><forename type="first">K</forename><surname>Pendlebury</surname></persName>
		</author>
		<idno type="DOI">10.18061/ijsd.v4i0.4527</idno>
	</analytic>
	<monogr>
		<title level="j">The International Journal of Screendance</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings</title>
		<author>
			<persName><forename type="first">L</forename><surname>Pepino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Riera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ferrer</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.03502</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>cs, eess</note>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Moving Viewers: American Film and the Spectator&apos;s Experience</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Plantinga</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>University of California Press</publisher>
			<pubPlace>Berkeley</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Emotion Analysis in NLP: Trends, Gaps and Roadmap for Future Directions</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Plaza-Del-Arco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Curry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Curry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hovy</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.01222</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>cs</note>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations</title>
		<author>
			<persName><forename type="first">S</forename><surname>Poria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hazarika</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Naik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Cambria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mihalcea</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.02508</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note>cs</note>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">On the Use of Computable Features for Film Classification</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Rasheed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sheikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shah</surname></persName>
		</author>
		<idno type="DOI">10.1109/tcsvt.2004.839993</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Circuits and Systems for Video Technology</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="52" to="64" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">The Emotional Arcs of Stories Are Dominated by Six Basic Shapes</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Reagan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mitchell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kiley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Danforth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Dodds</surname></persName>
		</author>
		<idno type="DOI">10.1140/epjds/s13688-016-0093-1</idno>
	</analytic>
	<monogr>
		<title level="j">EPJ Data Science</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">A Circumplex Model of Affect</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Russell</surname></persName>
		</author>
		<idno type="DOI">10.1037/h0077714</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Personality and Social Psychology</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="1161" to="1178" />
			<date type="published" when="1980">1980</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Emotional Sentence Annotation Helps Predict Fiction Genre</title>
		<author>
			<persName><forename type="first">S</forename><surname>Samothrakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fasli</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0141922</idno>
	</analytic>
	<monogr>
		<title level="j">Plos One</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page">e0141922</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Plot Arceology: A Vector-Space Model of Narrative Structure</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">M</forename><surname>Schmidt</surname></persName>
		</author>
		<idno type="DOI">10.1109/BigData.2015.7363937</idno>
	</analytic>
	<monogr>
		<title level="m">2015 IEEE International Conference on Big Data (Big Data)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1667" to="1672" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">TRILLsson: Distilled Universal Paralinguistic Speech Representations</title>
		<author>
			<persName><forename type="first">J</forename><surname>Shor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venugopalan</surname></persName>
		</author>
		<idno type="DOI">10.21437/Interspeech.2022-118</idno>
		<idno type="arXiv">arXiv:2203.00236</idno>
	</analytic>
	<monogr>
		<title level="m">Interspeech</title>
				<imprint>
			<date type="published" when="2022">2022. 2022</date>
			<biblScope unit="page" from="356" to="360" />
		</imprint>
	</monogr>
	<note>cs, eess</note>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">From Louvain to Leiden: Guaranteeing Well-Connected Communities</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Traag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Waltman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">J</forename><surname>Van Eck</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41598-019-41695-z</idno>
	</analytic>
	<monogr>
		<title level="j">Scientific Reports</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">5233</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">The Emotion Dynamics of Literary Novels</title>
		<author>
			<persName><forename type="first">K</forename><surname>Vishnubhotla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hammond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hirst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Mohammad</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.02474</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>cs</note>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Levels of Achievement in Acting</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">B</forename><surname>Wilson</surname></persName>
		</author>
		<idno type="DOI">10.2307/3204063</idno>
		<idno>JSTOR: 3204063</idno>
	</analytic>
	<monogr>
		<title level="j">Educational Theatre Journal</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="230" to="236" />
			<date type="published" when="1951">1951</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Unpacking Gender Stereotypes in Film Dialogue</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhillon</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-19097-1\_26</idno>
	</analytic>
	<monogr>
		<title level="m">Social Informatics</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Hopfgartner</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Jaidka</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Mayr</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Jose</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Breitsohl</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="398" to="405" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Zahiri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Choi</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1708.04299</idno>
		<idno type="arXiv">arXiv:1708.04299</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.acl-long.391</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<editor>
			<persName><forename type="first">S</forename><surname>Muresan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Villavicencio</surname></persName>
		</editor>
		<meeting>the 60th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Dublin, Ireland</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="5699" to="5710" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Evaluation and Alignment of Movie Events Extracted via Machine Learning from a Narratological Perspective</title>
		<author>
			<persName><forename type="first">F</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Pianzola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2023 Computational Humanities Research Conference</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>CHR 2023</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="49" to="62" />
		</imprint>
	</monogr>
	<note>CEUR-WS. org</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
