<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">LTUHH@STSS: Applying Coreference to Literary Scene Segmentation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hans</forename><forename type="middle">Ole</forename><surname>Hatzel</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Language Technology Group</orgName>
								<orgName type="institution">Universität Hamburg</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Chris</forename><surname>Biemann</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Language Technology Group</orgName>
								<orgName type="institution">Universität Hamburg</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">LTUHH@STSS: Applying Coreference to Literary Scene Segmentation</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">807C09E6CBB17F3EE76D9F613B5F3837</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T14:04+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this work, we describe a system for scene segmentation that, relying on character constellations as one of the defining characteristics of scenes, employs a state-of-the-art coreference system. Conceptually building on one of the presented baseline systems, we use a transformer model, enhanced with additional coreference-based features, to identify scene boundaries on the basis of sentence pairs. Finding one of our system's core weaknesses to lie in its local decision making, we adapt an equidistant constraint, avoiding the common error of predicting very short scenes that in many cases only cover a single sentence. We show that coreference is a suitable feature for scene segmentation and experiment with dynamic programming approaches for non-local decisions. This work is a submission to the shared task on scene segmentation (STSS) held at KONVENS 2021, where participants were asked, given annotated training data, to build systems that split novels into scenes: segments narrating a coherent action in one location with the same characters. Our system ranks 4/4 and 4/5 in Track 1 and Track 2, respectively.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Character constellations are among the most defining characteristics of scenes; in this work, we describe a scene segmentation system exploiting this characteristic. Other defining aspects of scenes, such as story and discourse time being equal and the fact that they contain a coherent sequence of actions, are not explicitly modeled in this work. The shared task on scene segmentation hosted by <ref type="bibr" target="#b11">Zehe et al. (2021b)</ref> provides training data in the form of 22 dime novels, with an additional test set (unpublished for the task duration) and a single trial document. We chose a transformer-based approach as a starting point: we use BERT <ref type="bibr" target="#b3">(Devlin et al., 2019)</ref> for scene segmentation, following the general approach of the best baseline proposed by <ref type="bibr" target="#b10">Zehe et al. (2021a)</ref>. Further, we enrich the BERT-based representation using two sets of features: (a) a coreference-based approach to finding the characters in a given scene and (b) a set of surface features we believe may be helpful. In a second step, we improve our model's results by adding non-local decisions in the form of a cost function optimized using a dynamic programming technique.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p><ref type="bibr" target="#b8">Pethe et al. (2020)</ref> approach the task of chapter segmentation, i.e. splitting a document into its chapters. This task is related to scene segmentation in that it operates on a similar domain. As we conjecture, chapter boundaries may also correspond with changes in location or characters, making this work more relevant still. <ref type="bibr" target="#b8">Pethe et al. (2020)</ref> take an equidistant approach to chapter segmentation, thereby enhancing local decisions with the knowledge that chapter boundaries tend to be somewhat evenly placed throughout a novel. 
The equidistant approach is applied by minimizing the following cost function:</p><formula xml:id="formula_0">cost(n, k) = min_{i ∈ [0, n−1]} [ cost(i, k−1) + (1 − α) · |n − i| / L − α · s_n ]</formula><p>Here, k is the number of breaks to be inserted, n the position at which to insert a break, and L the target length of each segment. α is a hyperparameter controlling the impact of the local boundary score s_n, with values approaching one placing more importance on local decisions.</p><p>In our previous work <ref type="bibr" target="#b9">(Schröder et al., 2021)</ref>, we trained state-of-the-art models for coreference resolution on German data. Following the coarse-to-fine inference architecture for coreference <ref type="bibr" target="#b7">(Lee et al., 2018)</ref>, we fine-tune transformer models on the German TüBa-D/Z dataset, adapting them to the literature domain by further fine-tuning on the DROC dataset <ref type="bibr" target="#b6">(Krug et al., 2018)</ref>. While some of our models can handle texts of arbitrary length, in this work we only rely on the coarse-to-fine model, whose application is limited to shorter documents due to its memory requirements.</p></div>
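The recurrence above can be sketched as a small dynamic program. This is only an illustrative implementation of the cost function as written; the function and variable names (`segment`, `scores`, `back`) are ours, not from Pethe et al.'s code, and the first boundary's distance is measured from the document start as a simplifying assumption.

```python
def segment(scores, k, L, alpha=0.9):
    """Place k boundaries by minimizing
    cost(n, k) = min_i cost(i, k-1) + (1 - alpha)*|n - i|/L - alpha*s_n
    via dynamic programming; scores[n] is the local boundary score s_n."""
    N = len(scores)
    INF = float("inf")
    # cost[j][n]: best total cost with j boundaries, the j-th at position n
    cost = [[INF] * N for _ in range(k + 1)]
    back = [[-1] * N for _ in range(k + 1)]
    for n in range(N):  # first boundary: distance measured from the start
        cost[1][n] = (1 - alpha) * n / L - alpha * scores[n]
    for j in range(2, k + 1):
        for n in range(N):
            for i in range(n):
                c = cost[j - 1][i] + (1 - alpha) * (n - i) / L - alpha * scores[n]
                if c < cost[j][n]:
                    cost[j][n], back[j][n] = c, i
    # backtrack from the cheapest final boundary position
    n = min(range(N), key=lambda m: cost[k][m])
    bounds = []
    for j in range(k, 0, -1):
        bounds.append(n)
        n = back[j][n]
    return sorted(bounds)
```

With a large α, the DP mostly follows the local score peaks; smaller values pull boundaries toward an even spacing of L.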
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Model and Features</head><p>In order to maximize the contextual information input to BERT, we do not pass an explicit context in conjunction with the two sentences in question (unlike the baseline approach in <ref type="bibr" target="#b10">Zehe et al., 2021a)</ref>. Instead, our approach follows the Next Sentence Prediction (NSP) training objective of BERT. For each sentence boundary present in the input data, we predict whether the sentences to either side are part of the same scene or whether there is a boundary between them (i.e. we perform a binary classification on the input "[CLS] scene candidate a [SEP] scene candidate b [SEP]"). Note that in the context of the NSP task, "sentence" actually refers to any input sequence and not a sentence in the linguistic sense. We see this alignment with NSP as a benefit of our system, enabling us to leverage more of BERT's pre-trained capabilities. For this reason, we also chose a BERT model rather than an Electra model <ref type="bibr" target="#b2">(Clark et al., 2020)</ref>, as Electra models are not trained on the NSP objective.</p><p>While we did experiment with a BERT model trained on German literary data<ref type="foot" target="#foot_1">1</ref>, we did not find success with it, which we attributed to the fact that it is fine-tuned on named entity recognition and may, in a case of catastrophic forgetting, have lost the ability to perform the NSP task. While the coreference-based features rely on our previous work <ref type="bibr" target="#b9">(Schröder et al., 2021)</ref>, for all of the remaining feature extraction we used the "de_core_news_lg" model in spaCy <ref type="bibr" target="#b5">(Honnibal et al., 2020)</ref>. All features are passed, in conjunction with the pooled BERT output (i.e. the [CLS] token's embedding), into a linear layer with GELU activation function <ref type="bibr" target="#b4">(Hendrycks and Gimpel, 2020)</ref>. 
Final predictions are made using an individual linear layer for each of the three outputs: a binary scene type label for each of the two sequences and the binary decision of whether there is a scene boundary between them, each with a sigmoid activation function. The model is trained using SGD and a binary cross-entropy loss for each of the three labels, with class weighting based on the training data distribution.</p></div>
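The NSP-style input construction can be sketched as follows. The string form is only illustrative; in practice a subword tokenizer would add the special tokens itself when given a sentence pair, and the function names here are ours.

```python
def nsp_input(sent_a, sent_b):
    """Format two adjacent sentences as a BERT NSP-style pair; the model
    then classifies whether a scene boundary falls between them."""
    return f"[CLS] {sent_a} [SEP] {sent_b} [SEP]"

def boundary_candidates(sentences):
    # one classification input per sentence boundary in the document
    return [nsp_input(a, b) for a, b in zip(sentences, sentences[1:])]
```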
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Coreference Features</head><p>Leveraging coreference features, we seek to model one of the central components of scenes: the character constellations. To this end, we pass the number of unique characters appearing in each of the input sequences, together with the number of unique characters appearing in both sequences, to the model.</p><p>Taking a more global approach to coreference would also be possible; in this case, the number of characters involved in the current context could be compared to the global number of characters. While this approach may yield further improvements, we did not test it, partly because global coreference resolution for long documents is still much more susceptible to errors than local approaches <ref type="bibr" target="#b9">(Schröder et al., 2021)</ref>.</p></div>
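The three counts described above amount to a small set operation over the coreference output. In this sketch (names ours), the entity ids stand in for resolved character clusters produced by a coreference system:

```python
def coref_features(chars_a, chars_b):
    """Character-constellation features for a sentence pair: the number of
    unique characters on each side plus the overlap between the two sides.
    chars_a / chars_b are collections of coreference-cluster ids."""
    a, b = set(chars_a), set(chars_b)
    return len(a), len(b), len(a & b)
```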
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Named Entity Recognition Features</head><p>One type of feature that we, following manual inspection of the training data, expect to be predictive of scene boundaries is named entities. The explicit mention of characters, as well as that of locations, should indicate a scene change. We extract the named entity tags for persons, locations, and miscellaneous entities and use document-length-normalized counts of each of them as model inputs. While the coreference features capture some similar information, they capture neither location mentions nor the distinction between explicit and anaphoric character mentions. Using an NER system trained specifically on literary data could help this step; such data is available in the DROC dataset <ref type="bibr" target="#b6">(Krug et al., 2018)</ref>.</p></div>
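A minimal sketch of the normalized entity counts, assuming the entity labels have already been extracted; the PER/LOC/MISC label names follow spaCy's German models, and the function name is ours:

```python
from collections import Counter

def ner_features(entity_labels, doc_len):
    """Document-length-normalized counts of person, location and
    miscellaneous entity mentions (PER/LOC/MISC as in spaCy's
    German pipelines)."""
    counts = Counter(entity_labels)
    return tuple(counts[label] / doc_len for label in ("PER", "LOC", "MISC"))
```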
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Surface Features</head><p>In an effort to improve our model, we added a set of surface features that we believed may be indicative of scene changes. We passed the number of tokens (including special characters such as quotes and punctuation) fulfilling each of the following properties to our model:
• being punctuation
• being uppercased
• being quotation marks
• being a stop word
• being the start of a sentence
While all these features could, in principle, be picked up by means of representation learning in our neural model, we still add them due to the relatively small number of training samples.</p></div>
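The counting step can be sketched as below. The token representation is a deliberate simplification of spaCy's token attributes, the exact property definitions (e.g. what counts as "uppercased" or as a quotation mark) are our reading of the list above, and all names are illustrative:

```python
import string

def surface_features(tokens, stop_words):
    """Counts of tokens with each surface property; `tokens` is a list of
    (text, is_sent_start) pairs -- a simplification of spaCy's token
    attributes."""
    return {
        "punct": sum(t in string.punctuation for t, _ in tokens),
        "upper": sum(t[:1].isupper() for t, _ in tokens),
        "quote": sum(t in "\"'„“»«" for t, _ in tokens),
        "stop": sum(t.lower() in stop_words for t, _ in tokens),
        "sent_start": sum(s for _, s in tokens),
    }
```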
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Intermediate Results</head><p>While, in principle, our model is capable of predicting both scene boundaries and scene types, our final system uses two distinct models with the same architecture and inputs for the two tasks. Joint training presents non-trivial challenges in balancing the two target objectives but may yield improvements in final results. Both models were trained with early stopping on the trial data (i.e. one document provided with the task description but not as part of the training data); a hyperparameter search for individual learning rates for the final layers (between 1 × 10^−3 and 1 × 10^−5) and the BERT model (between 1 × 10^−4 and 2 × 10^−5) was performed using the Tree-structured Parzen Estimator <ref type="bibr" target="#b1">(Bergstra et al., 2011)</ref> implementation by <ref type="bibr" target="#b0">Akiba et al. (2019)</ref>. The final model for scene types stopped after 5000 steps (returning to the weights from step 2000) with batch size 24 (and an evaluation frequency of 1000 steps) and used a learning rate of 9.9 × 10^−5 for BERT and 6.4 × 10^−4 for the final layers. The final model for scene boundaries stopped after 18 000 steps (returning to the weights from step 15 000) with batch size 24 (and an evaluation frequency of 1000 steps) and used a learning rate of 4.8 × 10^−5 for BERT and 2.84 × 10^−5 for the final layers.</p><p>Using the features described so far, we reach an F1-score of 33.7 on the task's trial document 2 , presumably already outperforming the baseline system. Figure <ref type="figure">1</ref> illustrates the predicted boundaries together with the network's output values for each of the potential scene splits, i.e. each pair of sentences. Notably, there are multiple cases of two or more directly adjacent false positives, sometimes, as at the very end of the document, in conjunction with a true positive boundary. 
This illustrates what we see as a key weakness of our initial model: since decisions are purely local, the model, when in doubt about the placement, creates multiple boundaries where one would be sufficient.</p><p>2 Unless otherwise specified, F1-score refers to the boundary class's F1-score throughout this document.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Non-Local Model</head><p>As discussed in Section 4, we see an issue in the local nature of scene segmentation decisions. One approach to remedy this may be training on sequences of adjacent sentence pairs; this would have the advantage of allowing for non-local decisions informed by any part of the neighboring inputs. At the same time, however, this increases the memory requirements, and with scene boundaries occurring about every 43 sentences on average, jointly training with a large enough context may be infeasible (depending on available GPU memory). Our early approaches instead focused on using neural sequence models on the local decision outputs, but with this approach we did not manage to improve upon the local-decision-based results.</p><p>Instead, we chose a purely algorithmic approach without training: the dynamic programming (DP) approach by <ref type="bibr" target="#b8">Pethe et al. (2020)</ref>, a technique that requires prior knowledge of the number of chapter, or in our case scene, boundaries. Applying their approach (with α = 0.9) to the task's held-out trial document, given the correct number of scene boundaries, results in an F1-score of 39.1. This represents an improvement of around 5.4 over the local F1-score of 33.7. For comparison, when only using the k highest confidence values, where k is the number of gold boundaries, we only get an F1-score of 34.8, illustrating that the mere knowledge of the number of scenes is not as impactful.</p><p>Figure <ref type="figure">2</ref> shows the effect the cost function can have on decisions: while α = 0.7 actually entails a worse F1-score, the effect is very subtle when using larger α values (i.e. 
when incorporating local decisions to a larger extent).</p><p>Figure <ref type="figure" target="#fig_1">3</ref> illustrates that the coefficient of variation (CV) of the shared task's scene lengths is much higher than that of the chapter data in the work by <ref type="bibr" target="#b8">Pethe et al. (2020)</ref>, where the distribution is centered around a value below 0.5. This can be interpreted as the length of chapters within most documents being less variable than the length of scenes in many documents of our dataset, although it is to be noted that the two statistics are computed on very different datasets. The standard deviation of the distribution of average per-document scene lengths (in sentences) is 10.84 with a mean of 45.3 and, accordingly, a CV of 0.24.</p><p>Another very simple approach to using non-local information is to only consider the top value within a fixed window to actually constitute a boundary. For this, we walk across the boundary candidates and, in a fixed-sized window, set the boundary class to zero for all but the largest value in the window. With a window size of five, for example, this means that no candidate with a larger confidence value among its four neighbors (two to either side) will be predicted. Using this simple strategy, however, we adversely impact the quality of our predictions, going from an F1-score of 33.7 to one of 27.8.</p><p>The improvements attained by application of the DP technique by <ref type="bibr" target="#b8">Pethe et al. (2020)</ref>, in combination with the variance of 0.74 in the task's trial document, illustrate just how important non-local information is to improving performance in this task. Further work on neural sequence models may yield significant improvements.</p><p>Our final model uses the DP approach by <ref type="bibr" target="#b8">Pethe et al. (2020)</ref> with α = 0.8, i.e. a strong focus on local values. 
As explicitly stated in their paper, this method assumes knowledge of the actual number of boundaries, which is not available for our data. We apply the heuristic of assuming the number of actual boundaries to be equal to the number of locally predicted boundaries. This way, the non-local approach effectively only moves the positions at which splits happen but does not change their total number. Unsurprisingly, given the variance in scene lengths, we found this to outperform the heuristic of dividing the text length by the average scene length. Further, we adapt the cost function to be more lenient with regard to scenes shorter than the average, as long as they are not too short.</p><p>Figure <ref type="figure" target="#fig_2">4</ref> shows how we adapt the equidistant constraint by <ref type="bibr" target="#b8">Pethe et al. (2020)</ref> to punish very short distances. Where their cost function is linear in both directions, we adapt it to harshly punish only very short distances: </p><formula xml:id="formula_1">−log(x + 1) · (1/β)<label>(1)</label></formula><p>For this, we apply the cost function in Equation <ref type="formula" target="#formula_1">1</ref> to negative deviations from the target distance L; β is a hyperparameter controlling how close to a distance of zero very large costs set in. We use β = 2. For positive deviations, we use x², effectively increasing the inherent α but also changing the relation of long distances to short ones.</p><p>Evaluating the same technique on our training data yielded a marginal improvement of around 0.01 F1; this is to be expected, as some memorization of training samples should lead to improved local decisions. 
This result does give us confidence that the approach will not adversely impact test set performance.</p><p>While the equidistant cost function, after optimizing α on the held-out data, performed on par with our cost function on that data, on the training data (on which our α value was not optimized) the equidistant function only increased performance by 0.003 F1.</p><p>Further analysis is needed to provide a clear picture of the cost function's impact on unseen data. It already seems plausible, however, that our adaptation of the cost function presents an improvement over the equidistant cost function.</p></div>
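The asymmetric cost can be sketched as follows. This is our reading of Equation 1 and Figure 4 under the assumption that the deviation is normalized by L (so that a deviation of −L, i.e. x = −1, corresponds to a segment length of zero); the function name is illustrative:

```python
import math

def deviation_cost(length, L, beta=2.0):
    """Asymmetric deviation cost: x is the deviation from the target
    length L, normalized so that x = -1 means a segment of length zero.
    Segments shorter than L cost -log(x + 1)/beta, which grows steeply
    near zero length; longer ones cost x**2."""
    x = (length - L) / L
    if x < 0:
        return -math.log(x + 1) / beta
    return x ** 2
```

Compared to a cost linear in |x|, this keeps moderately short scenes cheap while making near-zero-length scenes prohibitively expensive.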
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion and Final Results</head><p>We present an approach to scene segmentation that relies on character information. While we do not produce irrefutable evidence of its advantages, we propose a cost function more suitable to the needs of scene segmentation, adapting the work by <ref type="bibr" target="#b8">Pethe et al. (2020)</ref> to a new task.</p><p>On the official evaluation metric, we only reach an F1-score of 0.02 for Track 1 and an F1-score of 0.11 for Track 2. These are below the boundary class performance discussed earlier, as they include the correct classification of scene types. With our system focusing mostly on the placement of scene boundaries, it could potentially be extended with features more suitable for scene classification.</p><p>The system performs relatively poorly in Track 1, reaching the last place with quite a margin to the next system, but much better in Track 2, where it is close behind the third-placed system; what exactly causes this difference in performance remains unclear. We stay far behind the performance of the top-scoring systems, but coreference seems to be a salient feature that may be useful to include in future systems.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :Figure 2 :</head><label>12</label><figDesc>Figure 1: Positions of scene splits in the trial data using only local decisions</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The coefficient of variation in scene lengths for each individual document in the training data.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: The cost associated with deviation from the target distance L, where a deviation of −L is equivalent to a boundary distance of zero</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_1">https://huggingface.co/ severinsimmler/literary-german-bert</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Optuna: A nextgeneration hyperparameter optimization framework</title>
		<author>
			<persName><forename type="first">Takuya</forename><surname>Akiba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shotaro</forename><surname>Sano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Toshihiko</forename><surname>Yanase</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Takeru</forename><surname>Ohta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Masanori</forename><surname>Koyama</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining<address><addrLine>Anchorage, Alaska, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="2623" to="2631" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Algorithms for hyper-parameter optimization</title>
		<author>
			<persName><forename type="first">James</forename><surname>Bergstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rémi</forename><surname>Bardenet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yoshua</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Balázs</forename><surname>Kégl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<meeting><address><addrLine>Granada, Spain</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="469" to="477" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">ELECTRA: Pretraining text encoders as discriminators rather than generators</title>
		<author>
			<persName><forename type="first">Kevin</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Minh-Thang</forename><surname>Luong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Quoc</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ming-Wei</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">Dan</forename><surname>Hendrycks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kevin</forename><surname>Gimpel</surname></persName>
		</author>
		<idno>arxiv:1606.08415</idno>
		<title level="m">Gaussian error linear units (GELUs)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>Computing Research Repository. Version 4</note>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">Matthew</forename><surname>Honnibal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ines</forename><surname>Montani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sofie</forename><surname>Van Landeghem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Adriane</forename><surname>Boyd</surname></persName>
		</author>
		<ptr target="https://github.com/explosion/spaCy/tree/v3.1.1" />
		<title level="m">spaCy: Industrial-strength Natural Language Processing in Python</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Description of a corpus of character references in German novels-DROC [Deutsches ROman Corpus</title>
		<author>
			<persName><forename type="first">Markus</forename><surname>Krug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lukas</forename><surname>Weimer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Isabella</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Luisa</forename><surname>Macharowsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stephan</forename><surname>Feldhaus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><surname>Puppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fotis</forename><surname>Jannidis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DARIAH-DE Working Papers</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Higher-order coreference resolution with coarse-tofine inference</title>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Luheng</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Luke</forename><surname>Zettlemoyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>New Orleans, Louisiana, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="687" to="692" />
		</imprint>
	</monogr>
	<note>Short Papers</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Chapter Captor: Text Segmentation in Novels</title>
		<author>
			<persName><forename type="first">Charuta</forename><surname>Pethe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Allen</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Steve</forename><surname>Skiena</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
		<title level="s">Online. Association for Computational Linguistics</title>
		<meeting>the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="8373" to="8383" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Neural end-to-end coreference resolution for German in different domains</title>
		<author>
			<persName><forename type="first">Fynn</forename><surname>Schröder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hans</forename><forename type="middle">Ole</forename><surname>Hatzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chris</forename><surname>Biemann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th Conference on Natural Language Processing</title>
				<meeting>the 17th Conference on Natural Language Processing<address><addrLine>Düsseldorf, Germany</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Detecting scenes in fiction: A new segmentation task</title>
		<author>
			<persName><forename type="first">Albin</forename><surname>Zehe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Leonard</forename><surname>Konle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lea</forename><forename type="middle">Katharina</forename><surname>Dümpelmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Evelyn</forename><surname>Gius</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andreas</forename><surname>Hotho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fotis</forename><surname>Jannidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lucas</forename><surname>Kaufmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Markus</forename><surname>Krug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><surname>Puppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nils</forename><surname>Reiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Annekea</forename><surname>Schreiber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nathalie</forename><surname>Wiedmer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</title>
				<meeting>the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</meeting>
		<imprint>
			<date type="published" when="2021">2021a</date>
			<biblScope unit="page" from="3167" to="3177" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Shared task on scene segmentation@konvens2021</title>
		<author>
			<persName><forename type="first">Albin</forename><surname>Zehe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Leonard</forename><surname>Konle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Svenja</forename><surname>Guhr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lea</forename><forename type="middle">Katharina</forename><surname>Dümpelmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Evelyn</forename><surname>Gius</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andreas</forename><surname>Hotho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fotis</forename><surname>Jannidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lucas</forename><surname>Kaufmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Markus</forename><surname>Krug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><surname>Puppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nils</forename><surname>Reiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Annekea</forename><surname>Schreiber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Shared Task on Scene Segmentation</title>
				<imprint>
			<date type="published" when="2021">2021b</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
