<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Temporal Embeddings and Transformer Models for Narrative Text Understanding</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Simone</forename><surname>Mellace</surname></persName>
							<email>simone@idsia.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Istituto Dalle Molle di Studi sull&apos;Intelligenza Artificiale (IDSIA)</orgName>
								<address>
									<settlement>Lugano</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alessandro</forename><surname>Antonucci</surname></persName>
							<email>alessandro@idsia.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Istituto Dalle Molle di Studi sull&apos;Intelligenza Artificiale (IDSIA)</orgName>
								<address>
									<settlement>Lugano</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Temporal Embeddings and Transformer Models for Narrative Text Understanding</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">459AF3ED7688ABAF83661842A0AC73AB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T14:12+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We present two deep learning approaches to narrative text understanding for character relationship modelling. The temporal evolution of these relations is described by dynamic word embeddings, which are designed to learn semantic changes over time. An empirical analysis of the corresponding character trajectories shows that such approaches are effective in depicting their dynamic evolution. A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters. The empirical validation shows that such facts (e.g., two characters belonging to the same family) might be spotted with good accuracy, even when using automatically annotated data. This provides a deeper understanding of narrative plots based on the identification of key facts. Standard clustering techniques are finally used for character de-aliasing, a necessary pre-processing step for both approaches. Overall, deep learning models appear to be suitable for narrative text understanding, while also providing a challenging and still largely unexploited benchmark for general natural language understanding.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Due to the inherent complexity of textual data, narrative text understanding remains a challenging and relatively unexplored research area for AI. Here we consider narrative text, such as novels and short stories (broadly termed literary text here), and try to address its lexical diversity and richness in terms of relations between entities [PAHS + 17]. In recent years, Deep Learning (DL) approaches have been found to positively impact Natural Language Processing (NLP), with impressive boosts in text extraction and understanding capabilities. This has only marginally concerned literary text <ref type="bibr" target="#b16">[LB19]</ref>, where the application of DL models remains relatively unexplored. Some researchers have modelled character networks using machine learning, mostly from a social network perspective based on generative models for conversational dialogues [ACJR12, CHTH + 10], without involving state-of-the-art DL approaches. Only a few works have been reported on character evolution and relational analysis [CSDID16, <ref type="bibr" target="#b12">KA19,</ref><ref type="bibr" target="#b27">VKA20]</ref>.</p><p>Here we evaluate the application of DL to literary text understanding. The goal is to describe character relationships within a novel and their evolution. Moreover, we also want to emphasize the potential of literary text as a challenging benchmark for state-of-the-art language models, whose major applications are typically in other domains, such as biomedical literature <ref type="bibr" target="#b18">[LZFJ17]</ref> or fake news detection <ref type="bibr" target="#b25">[RSL17]</ref>, where both the lexical richness and the intricacy of the relations between entities are typically lower than in the literary domain.</p><p>To analyse the character relationships, both supervised and unsupervised DL techniques are considered here. 
A classification model to identify the relations between characters using BERT (Bidirectional Encoder Representations from Transformers, <ref type="bibr" target="#b8">[DCLT19]</ref>) is trained from supervised data. BERT has been successfully used in various classification tasks but, to the best of our knowledge, not yet in the literary text domain. Moreover, the manual annotation of training data in this field can be very expensive, which represents a strong limitation for this line of research. To partially bypass this issue, we also present a simple approach to automatically generate training data for character relation classification (focusing on family relations, such as parent of, sibling of).</p><p>At the unsupervised level, we consider the dynamic evolution of the characters over time (i.e., across the text). To do that, we learn vectors associated to different characters based on so-called dynamic or temporal embeddings <ref type="bibr" target="#b3">[BM17]</ref>, which allow learning vectors over different slices of the text (e.g., chapters or fixed amounts of text) while keeping the vectors comparable over time thanks to a common initialization. We analyse the relations between characters by visualizing the character trajectories over time, either through low-dimensional projections or by considering the relative distances between the vectors in the original, high-dimensional, space.</p><p>Both techniques require a pre-processing step consisting of character detection, based on standard entity recognition techniques, and character de-aliasing, for which density-based clustering methods are adopted.</p><p>The paper is organized as follows. A review of existing work is in Section 2. Sections 3 and 4 report a discussion of, respectively, the unsupervised and supervised approaches. An empirical validation is in Section 5. Conclusions and outlooks are finally reported in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Literature Review</head><p>The onset of DL has led to powerful data processing models, which facilitate NLP applications. In this context, systems that understand the semantic and syntactic aspects of a text are extremely important. Word embedding models such as Word2Vec [MSC + 13] or GloVe <ref type="bibr" target="#b24">[PSM14]</ref>, as well as sentence embedding models such as USE (Universal Sentence Encoder, [CYK + 18]), help represent text as a mathematical object in a reliable way. Text representations based on such embeddings, together with deep neural networks [HS97, MMY + 16], have played a vital role in text extraction, classification, and clustering.</p><p>Another major shift was the introduction of attention models and transformers [VSP + 17], language models able to better understand text semantics through contextual analysis. BERT, ELMO [PNI + 18], and various versions of these models gave a big boost to recent NLP applications [HSG + 19]. Moreover, word embeddings, originally intended as static models of a given corpus, later led to the exploration of their dynamic evolution over time, mainly to compare the semantic shifts of words over time and to detect word analogies <ref type="bibr" target="#b13">[KARPS15,</ref><ref type="bibr" target="#b15">KØSV18]</ref>. Notably, some of these works used BERT for story ending prediction and temporal event extraction <ref type="bibr" target="#b9">[HLAP19,</ref><ref type="bibr" target="#b17">LDL19]</ref>. In the next sections, we show how these models can be applied to literary text understanding.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Character Trajectories by Temporal Word Embeddings</head><p>Both the unsupervised technique presented in this section and the supervised approach discussed in Section 4 require a reliable identification of the characters involved in the plot. This corresponds to a named entity recognition task, for which standard tools can be used.<ref type="foot" target="#foot_0">1</ref> As the same character can occur in the text under different aliases (e.g., Ron and Ronald Weasley), de-aliasing might be needed as an additional pre-processing step. We achieve that by clustering the named entities with the DBSCAN algorithm <ref type="bibr" target="#b2">[BK07]</ref>. The entities are clustered using precomputed distances based on the sequence matcher algorithm, which finds the longest common subsequences.</p><p>After character identification and de-aliasing, learning the embeddings of the characters of a literary text is a straightforward task. As the learning of an embedding is based on contextual information, the only important condition is that a sufficient amount of co-occurrences of the characters is available in the text. If this is the case, the relative distances between the vectors can be used as proxy indicators of the relations between the corresponding characters. This can also be done for separate parts of the same text (e.g., chapters), provided that the amount of text remains sufficient for learning. In this way it is possible to capture the relations between characters for each part, but not to describe the dynamic evolution of the same character over the whole text: vectors trained in different embeddings, even with the same dimensionality, are in fact not directly comparable.</p><p>The method employed in <ref type="bibr" target="#b7">[DCBP19]</ref> elegantly addresses this issue by aligning different temporal representations through a shared coordinate system. 
The model uses a skip-gram Word2Vec architecture, where the context matrix (the output weight matrix) is kept fixed during training, while the word embedding (input) weight matrices are allowed to change on the basis of the co-occurrence frequencies specific to a given temporal interval. After training, the model returns slice-specific embeddings, which we treat as temporal word embeddings. To achieve that, a static word embedding is first trained with random initialization on the whole text, ignoring temporal slices.</p><p>Let us denote as W the corresponding word embedding matrix and as W′ the corresponding context matrix. For each slice, we instead initialize the word embedding matrix with W while keeping W′ as a frozen context matrix, equal for all the time slices <ref type="bibr" target="#b7">[DCBP19]</ref>. This initialization has been shown to force alignment, making it possible to compare vectors from embeddings associated to different time slices. The architecture is depicted in Figure <ref type="figure" target="#fig_0">1</ref>. In particular, we adopt the dynamic initialization scheme proposed in <ref type="bibr" target="#b27">[VKA20]</ref>, which appears to be more suitable for narrative text because of its intrinsically sequential nature.</p><p>Dynamic embeddings, generally used for word analogies, are considered here to describe and interpret relations by means of the trajectories spanned by the vectors associated to different characters. The character embeddings are represented in a visual space by dimensionality reduction <ref type="bibr" target="#b19">[MH08]</ref> to understand the evolving relations between characters across the time slices of a novel, such as chapters or other parts of the text. This could be further related to character sentiments, clustering of emotions, and other descriptions. </p></div>
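The frozen-context scheme described above can be illustrated with a minimal numpy sketch. This is not the implementation of [DCBP19] or [VKA20]: the function name, the single-negative-sample simplification, and the hyper-parameters are illustrative. The key point it demonstrates is that the shared context matrix W′ never changes, while slice-specific word vectors, initialized from the static matrix W, are updated by skip-gram gradient steps on the slice's co-occurrence pairs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_slice(pairs, W_init, C_frozen, lr=0.05, epochs=5, rng=None):
    """Train slice-specific word vectors against a frozen context matrix.

    pairs    : list of (center_idx, context_idx) co-occurrences in one slice
    W_init   : static word-embedding matrix, used as common initialization
    C_frozen : shared context matrix (never updated); keeping it fixed is
               what makes vectors from different slices comparable
    """
    if rng is None:
        rng = np.random.default_rng(0)
    W = W_init.copy()  # per-slice trainable word vectors
    V = W.shape[0]
    for _ in range(epochs):
        for w, c in pairs:
            # positive pair: push W[w] towards C_frozen[c]
            grad = (sigmoid(W[w] @ C_frozen[c]) - 1.0) * C_frozen[c]
            # one random negative sample (may rarely collide with c; fine
            # for a sketch): push W[w] away from C_frozen[n]
            n = rng.integers(V)
            grad += sigmoid(W[w] @ C_frozen[n]) * C_frozen[n]
            W[w] -= lr * grad  # only the word matrix moves
    return W
```

In practice one would first train a standard skip-gram model on the whole text to obtain W and W′ (here `C_frozen`), then call `train_slice` once per chapter or fixed-size slice.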
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">BERT-based Classification of Character Relations</head><p>The unsupervised approach considered in the previous section describes the relation between characters in terms of the relative positions of the corresponding vectors and their evolution. Here we consider instead character relation extraction based on binary classifiers. This is a supervised approach based on a ground truth of annotated sentences, in which the two characters are identified together with a Boolean value expressing whether or not the relation under consideration is met. The character names or aliases are eventually replaced by anonymous placeholders, as this helps the model learn the relationships by abstracting from the specific names.</p><p>For the learning phase, we use the BERT classification model. Its pre-trained model can be fine-tuned for classification with an additional output layer. BERT uses a WordPiece tokenizer and two special tokens ([CLS] and [SEP]), which encode valuable information about sentence structure and semantics after fine-tuning.</p><p>BERT-base has twelve transformer layers; in the classification task, the pooled embedding of the [CLS] token is fed into a linear classifier for predictions. Thanks to its powerful attention mechanisms, BERT embeddings encode deep semantic and syntactic contextual information. The relation extraction problem is modelled as a single-sentence classification task using the BERT model. More details about this general architecture are in <ref type="bibr" target="#b8">[DCLT19]</ref>.</p><p>As creating a ground truth in this field might be very expensive, we also discuss techniques for automatic data annotation. As an example, let us focus on family relations, where the problem is to decide whether or not there is a familial relation between two characters. 
By increasing the neighbourhood parameter, the DBSCAN clustering used for de-aliasing produces clusters in which the characters belonging to the same family end up together (as they share the same surname). E.g., in the Harry Potter books, we obtain clusters with the Potter and the Weasley families. These clusters are used for the automatic creation of the positive samples in the training data, while the remaining characters are used for negative sample generation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Experimental Analysis</head><p>The approaches to character relation modelling (Section 3) and understanding (Section 4) discussed above are validated here on two novels: Little Women (LW) by L.M. Alcott (text length 197'524 words) and the first six books of the Harry Potter series (HP) by J.K. Rowling (885'943 words).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Family Relationship Classification.</head><p>Due to the intricate nature of its plot and its length, HP is often used as a benchmark for natural language understanding in the literary domain [BDE + 16, Spa13]. As an application of the ideas discussed in Section 4, we consider the task of predicting whether or not a given pair of characters has a family relation. A BERT-based classifier is used for that.<ref type="foot" target="#foot_1">2</ref> Out of the six books, we use the sentences generated from five books as training set and the remaining book as test set, according to a cross-validation scheme. For the training set, the automatic class labelling is done by creating clusters for the same family groups (see Section 4). The number of samples for each book is 160, 250, 239, 396, 478, and 231, the ratio of positive samples for each book being 30.0%, 39.2%, 28.9%, 38.6%, 62.6%, and 47.6%. BERT is trained with the Adam optimizer <ref type="bibr" target="#b14">[KB14]</ref>, a learning rate equal to 2·10⁻⁵, warm-up equal to 0.1, and ten epochs. The results in Table <ref type="table" target="#tab_0">1</ref> show reasonably good average performances and their standard deviations over the six books. Note that the aggregated values, corresponding to weighted averages, might be higher than those for negative or positive samples only. A test is also done on the LW benchmark with the HP training data. A lower F-score (64%) is obtained, possibly related to an over-fitting effect. This might be relevant for literary texts, where the differences between datasets (e.g., texts by different authors) are typically stronger than in other domains.</p><p>It is important to note that the classification model we implemented makes predictions at the sentence level. 
When coping with character pairs, it would be more appropriate to work at a higher level, i.e., to predict with respect to all the sentences that express the character pair relation. This is achieved by a bag-of-sentences approach, where a character pair is considered to have a relation if at least one of its sentences is predicted as positive. For HP there are 85 entity pairs (12 positive and 73 negative), and 9 positive (75%) and 66 negative (90%) pairs are correctly predicted. Considering the intrinsic complexity of literary text, where sentences might have very complex structures, this might be regarded as a promising result, and it also supports our strategy for the automatic generation of the training set.</p></div>
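The bag-of-sentences rule above amounts to a disjunction over sentence-level classifier outputs. A minimal sketch (function and variable names are illustrative, not from the paper's code):

```python
def pair_has_relation(sentence_preds):
    """Bag-of-sentences rule: a character pair is labelled as related if at
    least one sentence mentioning the pair is classified as positive (1)."""
    return any(sentence_preds)

def aggregate_pairs(pair_to_sentence_preds):
    """Lift sentence-level predictions (0/1) to pair-level labels."""
    return {pair: pair_has_relation(preds)
            for pair, preds in pair_to_sentence_preds.items()}
```

For instance, a pair whose sentence predictions are [0, 1, 0] is labelled as related, while [0, 0] is not.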
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Temporal Word Embeddings.</head><p>Following the discussion in Section 3, we train a temporal word embedding for the first six books of HP. We focus on the four characters that appear most frequently. The static embedding is trained on the whole text of each book, while the dynamic embeddings are based on sub-slices of 1000 characters each. For each character, we extract the corresponding trajectory for each book. For a better interpretation of the relations, we consider the main character (i.e., Harry) and plot the evolution over time of the relative (cosine) distances from the other characters. Since these vectors embed semantic information, smaller distances are expected to correspond to closer relations with Harry. In fact, the results in Figure <ref type="figure">2</ref> show that the trajectories of positive characters or friends (i.e., Ron and Hermione) move in a similar way. The main antagonist (i.e., Voldemort) is found instead to move in a different direction and at a greater distance. A similar analysis for LW is reported in Figure <ref type="figure" target="#fig_1">3</ref>. In this case we display a t-SNE <ref type="bibr" target="#b19">[MH08]</ref> two-dimensional projection of the vectors over different groups of chapters for the four major characters (i.e., the four March sisters). As a comment, the temporal word embedding seems to capture the separation, during the central part of the plot, between Jo and Amy, i.e., the two characters who left their home town, and the other two.</p></div>
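The distance trajectories described above reduce to computing, for each time slice, the cosine distance between the anchor character (e.g., Harry) and every other character. A minimal sketch, assuming one character-to-vector dictionary per slice, with all slices trained against the same frozen context matrix so the vectors are comparable (names are illustrative):

```python
import numpy as np

def cosine_distance(u, v):
    """1 minus the cosine similarity of two embedding vectors."""
    return 1.0 - float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def trajectories(slice_embeddings, anchor, others):
    """Per-slice cosine distance of each character from the anchor.

    slice_embeddings: list of {character: vector} dicts, one per time slice.
    Returns {character: [distance in slice 1, distance in slice 2, ...]}.
    """
    return {ch: [cosine_distance(emb[anchor], emb[ch])
                 for emb in slice_embeddings]
            for ch in others}
```

The resulting distance series can then be plotted over slices, or the vectors projected to 2-D with t-SNE for visual inspection.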
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion</head><p>In this paper, we presented supervised and unsupervised DL models for analysing and interpreting character relations in a novel. We used BERT classifiers to predict character relations, while an unsupervised approach based on temporal word embeddings was used to interpret the evolution of character relations. Both methods are found to be promising for exploring the relations among the characters of a novel. The approaches can thus be further applied to literary text understanding for deriving character networks and hence studying the relations and sentiments involved. In future work, we want to integrate these approaches into a more user-friendly tool for analysing character networks and use it for an extensive validation.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Training temporal embeddings</figDesc><graphic coords="3,162.00,313.70,291.61,226.63" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 2: Character trajectories for Harry Potter</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Character familial relation classification in Harry Potter</figDesc><table><row><cell>Samples</cell><cell>Precision</cell><cell>Recall</cell><cell>F-score</cell></row><row><cell>Negative Positive All</cell><cell>79 ± 13% 77 ± 9% 80 ± 4%</cell><cell>85 ± 7% 71 ± 9% 78 ± 7%</cell><cell>81 ± 8% 73 ± 4% 78 ± 7%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">E.g., see https://nlp.stanford.edu/software/CRF-NER.html.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">See https://github.com/huggingface/transformers.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Social network analysis of Alice in Wonderland</title>
		<author>
			<persName><forename type="first">Apoorv</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Augusto</forename><surname>Corvalan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Jensen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Owen</forename><surname>Rambow</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the NAACL-HLT 2012 Workshop on computational linguistics for literature</title>
				<meeting>the NAACL-HLT 2012 Workshop on computational linguistics for literature</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="88" to="96" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Mining and modeling character networks</title>
		<author>
			<persName><forename type="first">Anthony</forename><surname>Bonato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">Ryan</forename><surname>D&apos;Angelo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ethan</forename><forename type="middle">R</forename><surname>Elenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">F</forename><surname>Gleich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yangyang</forename><surname>Hou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International workshop on algorithms and models for the web-graph</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="100" to="114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">ST-DBSCAN: An algorithm for clustering spatial-temporal data</title>
		<author>
			<persName><forename type="first">Derya</forename><surname>Birant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alp</forename><surname>Kut</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data &amp; Knowledge Engineering</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="208" to="221" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Dynamic word embeddings</title>
		<author>
			<persName><forename type="first">Robert</forename><surname>Bamler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stephan</forename><surname>Mandt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 34th International Conference on Machine Learning</title>
				<meeting>the 34th International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="page" from="380" to="389" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The actortopic model for extracting social networks in literary narrative</title>
		<author>
			<persName><forename type="first">Asli</forename><surname>Celikyilmaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dilek</forename><surname>Hakkani-Tur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hua</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Greg</forename><surname>Kondrak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Denilson</forename><surname>Barbosa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NIPS Workshop: Machine Learning for Social Computing</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page">8</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Modeling evolving relationships between characters in literary novels</title>
		<author>
			<persName><forename type="first">Snigdha</forename><surname>Chaturvedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shashank</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hal</forename><surname>Daume III</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chris</forename><surname>Dyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of AAAI</title>
				<meeting>AAAI</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Universal sentence encoder for English</title>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Cer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yinfei</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sheng-Yi</forename><surname>Kong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nan</forename><surname>Hua</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nicole</forename><surname>Limtiaco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rhomni</forename><surname>St John</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noah</forename><surname>Constant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mario</forename><surname>Guajardo-Cespedes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Steve</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chris</forename><surname>Tar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</title>
				<meeting>the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="169" to="174" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Training temporal word embeddings with a compass</title>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Di Carlo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Federico</forename><surname>Bianchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Matteo</forename><surname>Palmonari</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of AAAI</title>
				<meeting>AAAI</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="6326" to="6334" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ming-Wei</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Contextualized word embeddings enhanced event temporal relation extraction for story understanding</title>
		<author>
			<persName><forename type="first">Rujun</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mengyue</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bashar</forename><surname>Alhafni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nanyun</forename><surname>Peng</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1904.11942</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">Sepp</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jürgen</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">BERT, ELMo, USE and InferSent sentence encoders: The panacea for research-paper recommendation?</title>
		<author>
			<persName><forename type="first">Hebatallah</forename><forename type="middle">A Mohamed</forename><surname>Hassan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Giuseppe</forename><surname>Sansonetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabio</forename><surname>Gasparetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alessandro</forename><surname>Micarelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joeran</forename><surname>Beel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2431</biblScope>
			<biblScope unit="page" from="6" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Novel2graph: Visual summaries of narrative text enhanced by machine learning</title>
		<author>
			<persName><forename type="first">K</forename><surname>Vani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alessandro</forename><surname>Antonucci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of Text2Story - Second Workshop on Narrative Extraction From Texts co-located with 41st European Conference on Information Retrieval (ECIR 2019)</title>
				<editor>
			<persName><forename type="first">Ricardo</forename><surname>Campos</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Alípio</forename><forename type="middle">Mário</forename><surname>Jorge</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Adam</forename><surname>Jatowt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Sumit</forename><surname>Bhatia</surname></persName>
		</editor>
		<meeting>Text2Story - Second Workshop on Narrative Extraction From Texts co-located with 41st European Conference on Information Retrieval (ECIR 2019)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="29" to="37" />
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Statistically significant detection of linguistic change</title>
		<author>
			<persName><forename type="first">Vivek</forename><surname>Kulkarni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rami</forename><surname>Al-Rfou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bryan</forename><surname>Perozzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Steven</forename><surname>Skiena</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 24th Int. Conf. on World Wide Web</title>
				<meeting>of the 24th Int. Conf. on World Wide Web</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="625" to="635" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Adam: A method for stochastic optimization</title>
		<author>
			<persName><forename type="first">Diederik</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jimmy</forename><surname>Ba</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1412.6980</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Diachronic word embeddings and semantic shifts: a survey</title>
		<author>
			<persName><forename type="first">Andrey</forename><surname>Kutuzov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lilja</forename><surname>Øvrelid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Terrence</forename><surname>Szymanski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Erik</forename><surname>Velldal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 27th International Conference on Computational Linguistics</title>
				<meeting>the 27th International Conference on Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1384" to="1397" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Extraction and analysis of fictional character networks: A survey</title>
		<author>
			<persName><forename type="first">Vincent</forename><surname>Labatut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xavier</forename><surname>Bost</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="1" to="40" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Story ending prediction by transferable BERT</title>
		<author>
			<persName><forename type="first">Zhongyang</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiao</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ting</forename><surname>Liu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1905.07504</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A neural joint model for entity and relation extraction from biomedical text</title>
		<author>
			<persName><forename type="first">Fei</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Meishan</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Guohong</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Donghong</forename><surname>Ji</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC bioinformatics</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">198</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Visualizing data using t-SNE</title>
		<author>
			<persName><forename type="first">Laurens</forename><surname>Van Der Maaten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Geoffrey</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of machine learning research</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="2579" to="2605" />
			<date type="published" when="2008-11">November 2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">How transferable are neural networks in NLP applications?</title>
		<author>
			<persName><forename type="first">Lili</forename><surname>Mou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhao</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rui</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ge</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yan</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhi</forename><surname>Jin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2016 Conference on Empirical Methods in Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="479" to="489" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Distributed representations of words and phrases and their compositionality</title>
		<author>
			<persName><forename type="first">Tomas</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilya</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kai</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Greg</forename><forename type="middle">S</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeff</forename><surname>Dean</surname></persName>
		</author>
		<editor>C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger</editor>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>NIPS</publisher>
			<biblScope unit="page" from="3111" to="3119" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Studying literary characters and character networks</title>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Piper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mark</forename><surname>Algee-Hewitt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Koustuv</forename><surname>Sinha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Derek</forename><surname>Ruths</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hardik</forename><surname>Vala</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DH</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Deep contextualized word representations</title>
		<author>
			<persName><forename type="first">Matthew</forename><forename type="middle">E</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mark</forename><surname>Neumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohit</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Matt</forename><surname>Gardner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><surname>Zettlemoyer</surname><forename type="first">Luke</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL-HLT</title>
				<meeting>NAACL-HLT</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2227" to="2237" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">GloVe: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">Jeffrey</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</title>
				<meeting>the 2014 conference on empirical methods in natural language processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">CSI: A hybrid deep model for fake news detection</title>
		<author>
			<persName><forename type="first">Natali</forename><surname>Ruchansky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sungyong</forename><surname>Seo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yan</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2017 ACM on Conference on Information and Knowledge Management</title>
				<meeting>the 2017 ACM on Conference on Information and Knowledge Management</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="797" to="806" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">On social networks in plays and novels</title>
		<author>
			<persName><forename type="first">Amelia</forename><forename type="middle">Carolina</forename><surname>Sparavigna</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Sciences</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">10</biblScope>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Temporal word embeddings for narrative understanding</title>
		<author>
			<persName><forename type="first">Claudia</forename><surname>Volpetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Vani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alessandro</forename><surname>Antonucci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twelfth International Conference on Machine Learning and Computing, ACM Press International Conference Proceedings Series</title>
				<meeting>the Twelfth International Conference on Machine Learning and Computing, ACM Press International Conference Proceedings Series</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>ICMLC 2020</note>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">Ashish</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noam</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Niki</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jakob</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Llion</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aidan</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lukasz</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Illia</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NIPS</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
