<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">NOVEL2GRAPH: Visual Summaries of Narrative Text Enhanced by Machine Learning</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Alessandro</forename><surname>Antonucci</surname></persName>
							<email>alessandro@idsia.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Istituto Dalle Molle di Studi sull&apos;Intelligenza Artificiale (IDSIA)</orgName>
								<address>
									<settlement>Lugano</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">NOVEL2GRAPH: Visual Summaries of Narrative Text Enhanced by Machine Learning</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">D5BF98FA20D7D453C9671EB8AC557827</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:57+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>A machine learning approach to the creation of visual summaries for narrative text is presented. Standard natural language processing tools for named entity recognition are used together with a clustering algorithm to detect the characters of the novel and their aliases. The most relevant characters and their relations are evaluated on the basis of a simple statistical analysis. These characters are visually depicted as nodes of an undirected graph whose edges describe relations with other characters. Specialized sentiment analysis techniques based on sentence embedding decide the colours of characters/nodes and their relations/edges. Additional information about the characters (e.g., gender) and their relations (e.g., siblings or partnerships) is returned by binary classifiers and visually depicted in the graph. For those specialized tasks, small amounts of manually annotated data are sufficient to achieve good accuracy. Compared to analogous tools, the machine learning approach we present allows for a richer representation of texts of this kind. A case study demonstrating this approach on a series of books is also reported.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The analysis and interpretation of potentially long and complex literary texts with the goal of extracting readable and concise information is a challenging task even for trained human experts. Developing AI systems able to support (or even replace) humans in such activities is therefore an exciting challenge for research in natural language processing. In fact, the analysis of literary texts involves various complex steps such as the identification of the main characters and relations, and the corresponding typification (e.g., gender, partnerships, goodness). Moreover, the high variance in style and lexicon, with frequent use of neologisms [MC+14] and figures of speech <ref type="bibr" target="#b13">[Nyg06]</ref>, further complicates the scenario. Manually created summaries based on graphs such as the one in Figure <ref type="figure" target="#fig_0">1</ref> can therefore be regarded as useful tools to better understand complex plots.</p><p>In recent years, natural language processing (NLP) has taken advantage of machine learning techniques based on deep neural networks trained on huge amounts of text data [MSC+13]. Word and sentence embedding techniques are now the state-of-the-art representation models in NLP [LGD15, <ref type="bibr" target="#b16">SJV19,</ref> <ref type="bibr" target="#b19">WBGL15]</ref>. The main goal of this paper is to show how these modern machine learning tools, as well as more traditional techniques (to be used if only small amounts of training data are available or needed), can significantly improve the quality of the information extracted from a narrative text, thus enhancing the readability of the corresponding summaries. 
In particular, we focus on the creation of visual outlines based on graphs, analogous to what readers and specialists often draw to understand the plot of complex novels, such as the one in Figure <ref type="figure" target="#fig_0">1</ref>.</p><p>Our approach to literary text analysis encompasses various NLP techniques such as named entity recognition, syntactic parsing and semantic text analysis, and NLP application tasks such as gender recognition, sentiment analysis and social network analysis. Most of these tasks have been deeply studied in the literature and standard tools are available for them. Yet, some tuning and adaptation to the specific case of narrative text are required, mainly because of the inherent nature of the text content. For instance, sentiment analysis is generally intended to process short texts such as product reviews or social network comments. Using those tools for narrative texts can produce inaccurate evaluations (Section 4) and, as collecting dedicated training data might be problematic, tools for semantic learning can instead give good results [HM17, KIO+16, MPGC17, NS07, NS19]. Besides sentiment analysis, extensive textual pre-processing is needed for literary texts because of their difficulty in terms of interpretability, as experienced even by humans.</p><p>Overall, we present a combination of NLP and machine learning (ML), with a particular focus on current trends in deep learning (DL), adapted to the analysis and visualization of textual narrative data. DL approaches are used to improve natural language understanding (NLU) of the text, together with specialized syntactic and semantic analyses to recognize the characters and their relations. Furthermore, to extract the sentiments of characters and relationships, we use classification and prediction approaches based on the Universal Sentence Encoder (USE, [CYK+18]). 
USE supports transfer learning using either the transformer architecture [VSP+17] or deep averaging networks <ref type="bibr" target="#b6">[IMBGDI15]</ref>. Moreover, the detection of family relations between the characters is reduced to binary classification tasks. The proposed method shows the strength of these techniques in achieving a good understanding of this particular kind of text.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Existing Work</head><p>The automatic interpretation and visual analysis of literary texts has been explored in the past few years from various perspectives. Most of the studies in this area focus on different methodologies to detect the characters of a novel and identify their relationships. One of the works focused on character detection proposes a new approach to named entity recognition based on co-reference resolution and on a set of heuristic rules integrating Stanford CoreNLP<ref type="foot" target="#foot_0">1</ref> <ref type="bibr" target="#b17">[VJPR15]</ref>. A similar tool<ref type="foot" target="#foot_1">2</ref> for efficient character annotation and alias resolution was proposed by [VDJ+16]. Sentiment analysis for the characters and their relations is another focus area. Besides Stanford CoreNLP, the approach in <ref type="bibr" target="#b4">[FPU15]</ref> also uses the SentiWordNet ontology<ref type="foot" target="#foot_2">3</ref> to build character networks. In [PAHS+17], literary characters and their network associations have also been studied, and the features that define character qualities are identified, while in <ref type="bibr" target="#b10">[NB13]</ref> sentiment relations between characters in Shakespeare's plays have been processed using the AFINN sentiment lexicon<ref type="foot" target="#foot_3">4</ref>. Some of the reported studies also present a dynamical analysis of the novels, thus taking into account that character traits and relations vary over the pages. In <ref type="bibr" target="#b3">[Els12]</ref>, a kernel method to capture novelistic plots over time has been considered. In <ref type="bibr" target="#b7">[KC12]</ref>, a map of the emotions over time, indicating the information flow, was used to predict the emotions. 
Gender-based stereotypes in novels are studied in <ref type="bibr" target="#b1">[Cri16]</ref> by analysing the masculine and feminine adjectives and the gender-biased portrayal of characters. The present work should be regarded as one of the first attempts to apply the recent advances of DL in NLP to the creation of visual summaries for narrative text.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Methodology</head><p>The algorithms we adopt to generate visual summaries of narrative text are implemented as a single Python software tool called NOVEL2GRAPH. The associated workflow is depicted in Figure <ref type="figure" target="#fig_2">2</ref>. The three subsections below describe the main NOVEL2GRAPH modules, while the very last step, i.e., the actual creation of the graph, is based on the Graphviz free software.<ref type="foot" target="#foot_4">5</ref></p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Character Identification and Alias Clustering</head><p>The first module of the workflow handles the identification of the characters in the novel. This also involves the recognition of aliases, as the author might refer to the same character in different ways. The Stanford named entity recognizer (NER)<ref type="foot" target="#foot_5">6</ref> is initially used to detect all the person entities. Afterwards, part-of-speech tagging is applied to filter out the false positives returned by the NER. Basic pre-processing steps such as tokenisation and stopword removal are also applied. The resulting list of persons is finally processed by the DBSCAN clustering algorithm <ref type="bibr" target="#b0">[BK07]</ref> to group together the aliases of the same character. As the pairwise distance between person names we use the Ratcliff-Obershelp algorithm <ref type="bibr" target="#b15">[RM88]</ref>. Finally, a dictionary is created with the representative character names (corresponding to the longest alias of each group) as keys and the aliases as values.</p></div>
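<div xmlns="http://www.tei-c.org/ns/1.0"><p>The alias clustering step can be illustrated by the following minimal sketch, which is not the NOVEL2GRAPH implementation. It assumes that Python's difflib gestalt matching (documented as similar to Ratcliff-Obershelp) is an acceptable stand-in for the distance, uses a small pure-Python DBSCAN, and picks hypothetical values for eps and min_samples, since the parameters actually used are not reported. The function names are illustrative only.</p>
<code lang="python"><![CDATA[
```python
from difflib import SequenceMatcher


def name_distance(a, b):
    """Distance in [0, 1]: one minus difflib's gestalt-matching similarity."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def dbscan(items, eps, min_samples, dist):
    """Minimal DBSCAN over arbitrary items; returns one label per item (-1 = noise)."""
    n = len(items)
    # Neighbourhood of each item, the item itself included.
    neighbours = [
        [j for j in range(n) if dist(items[i], items[j]) <= eps] for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisionally marked as noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


def alias_dictionary(names, eps=0.5, min_samples=1):
    """Map the longest alias of each cluster to the sorted list of its aliases.

    eps and min_samples are assumed values chosen for this toy example.
    """
    labels = dbscan(names, eps, min_samples, name_distance)
    groups = {}
    for name, label in zip(names, labels):
        if label != -1:
            groups.setdefault(label, []).append(name)
    return {max(group, key=len): sorted(group) for group in groups.values()}


persons = ["Harry Potter", "Harry", "Hermione Granger", "Hermione"]
aliases = alias_dictionary(persons)
```
]]></code>
<p>With min_samples set to 1 every name is a core point, so a character mentioned under a single name still receives its own dictionary entry; a larger value would instead discard isolated names as noise.</p></div>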
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Character and Relation Traits Identification</head><p>This module focuses on the extraction of character traits, which are supposed to describe the role of a character or the quality of a relation between two characters. For such a typification, we extract from the text the phrases that contain the name of the character (or one of its aliases). To this end, a syntactic parse tree is constructed by means of the pattern parse tool.<ref type="foot" target="#foot_6">7</ref> The number of phrases related to a character determines the size of the corresponding node in the graphical visualization. As the phrases can be too short to include a co-occurrence, for character pair relations we instead extract from the parse tree the sentences in which the character pair co-occurs. The number of sentences related to a character pair defines the thickness of the edge connecting the nodes of the two characters in the graph. This information is also used as the input for sentiment, gender and relation type analysis.</p></div>
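<div xmlns="http://www.tei-c.org/ns/1.0"><p>The counting behind node sizes and edge thickness can be sketched as follows. This is a simplified illustration under stated assumptions: a naive regular-expression sentence splitter replaces the pattern parser, and sentences are used as the counting unit for both characters and character pairs, whereas the module described above counts phrases for single characters. The function name and the sample text are hypothetical.</p>
<code lang="python"><![CDATA[
```python
import re
from collections import Counter
from itertools import combinations


def count_mentions(text, aliases):
    """Count sentences mentioning each character and each character pair.

    aliases maps a representative character name to the list of its aliases.
    """
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    node_weight = Counter()  # sentences per character, drives node size
    edge_weight = Counter()  # co-occurrence sentences per pair, drives edge thickness
    for sentence in sentences:
        lowered = sentence.lower()
        present = sorted(
            name
            for name, names in aliases.items()
            if any(
                re.search(r"\b" + re.escape(alias.lower()) + r"\b", lowered)
                for alias in names
            )
        )
        node_weight.update(present)
        # Pairs are emitted in sorted order, so each undirected edge has one key.
        edge_weight.update(combinations(present, 2))
    return node_weight, edge_weight


sample_text = (
    "Harry met Hermione on the train. Hermione smiled. "
    "Ron and Harry laughed at Hermione."
)
sample_aliases = {
    "Harry Potter": ["Harry Potter", "Harry"],
    "Hermione Granger": ["Hermione Granger", "Hermione"],
    "Ron Weasley": ["Ron Weasley", "Ron"],
}
nodes, edges = count_mentions(sample_text, sample_aliases)
```
]]></code>
<p>The two counters can then be thresholded and mapped linearly to node sizes and edge widths when the graph is drawn.</p></div>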
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Sentiment, Gender and Relationship Type Analysis</head><p>This module focuses on the identification of (positive or negative) sentiments describing the character traits and the relationships with other characters. The phrases and sentences extracted by the previous module are processed by two different methods: (a) the NLTK-VADER sentiment analyser;<ref type="foot" target="#foot_7">8</ref> (b) the Universal Sentence Encoder (USE).<ref type="foot" target="#foot_8">9</ref> The goal is to compare the two approaches in terms of accuracy in the classification of the possibly intricate sentiments occurring in a narrative text. NLTK-VADER is a rule-based method built on a lexicon of words annotated with positive or negative sentiment. The phrases and sentences are directly processed by the sentiment analyser, and the corresponding sentiscores, giving positive, negative or neutral weights to the sentiments, are recorded. The tool finally computes the compound score (i.e., a normalized aggregate of the sentiscores), which defines the sentiment. The second approach, based on DL, requires supervised training and allows the problem to be formulated as a classification task. USE helps to capture the meanings of sentences and their relations by representing them as embeddings at the sentence level. For the training, we use lists of adjectives collected online.<ref type="foot" target="#foot_9">10</ref> Adjectives seem to be the most informative tags to reasonably describe a character or a relationship. Similar techniques based on USE are also used for gender identification, for which dedicated and sufficiently large training sets are already available. In both cases, we use a Keras model with a dense layer of 256 units and a prediction layer with softmax activation, Adam optimisation, and categorical cross-entropy loss. The model is trained for 10 epochs with a batch size of 32. 
A further goal is to extract the relation types, which can describe family links or other relations such as siblings, friends, enemies or killers. For this task we created a small set of manually annotated data, which is instead processed with classical ML algorithms; currently we focus only on sibling relations.</p></div>
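<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the lexicon-based branch more concrete, the sketch below mimics how a VADER-style compound score is obtained: word valences are summed and squashed into the interval (-1, 1). It is a toy illustration and not the NLTK implementation. The lexicon values are invented, the rules for negation, intensifiers and punctuation are omitted, and the constant alpha = 15 and the cut-off 0.05 are assumed here to follow the usual VADER conventions.</p>
<code lang="python"><![CDATA[
```python
import math

# Hypothetical valences, for illustration only (not the VADER lexicon).
TOY_LEXICON = {"brave": 2.0, "kind": 1.5, "cruel": -2.5, "evil": -3.0}


def compound(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum the word valences and normalise the total into (-1, 1)."""
    tokens = (token.strip(".,;:!?").lower() for token in text.split())
    total = sum(lexicon.get(token, 0.0) for token in tokens)
    return total / math.sqrt(total * total + alpha)


def polarity(text, threshold=0.05):
    """Map the compound score to a sentiment label (threshold is an assumption)."""
    score = compound(text)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


label_hero = polarity("Harry is brave and kind.")
label_villain = polarity("Voldemort is cruel and evil.")
label_plain = polarity("He walked to the castle.")
```
]]></code>
<p>A fixed lexicon of this kind cannot adapt to the vocabulary of a novel, which is one reason for comparing it with a classifier trained on sentence embeddings.</p></div>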
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Empirical Analysis on a Case Study</head><p>For a preliminary validation of our methodology, we fed NOVEL2GRAPH with the six books of the Harry Potter series by J.K. Rowling. Due to lack of space, in Figures <ref type="figure" target="#fig_4">4 and 5</ref> we depict the graphs of two books only, after applying two thresholds: only the most important characters of each book are depicted, and only relations that occur more often than a given threshold are displayed. The size of nodes and the thickness of edges increase linearly with the number of occurrences in the text. Red denotes negative sentiments and blue positive ones. Relatives recognized by the relation type classifiers are connected by dashed edges. A small focus group of persons who had carefully read the books was asked to assess the quality of the detected relations and sentiments, and the comments were unanimously positive.</p><p>For a quantitative evaluation, we focus on the sentiment analysis. The focus group was asked to annotate the sentiments of the characters and the relations as positive or negative. While NLTK-VADER was originally trained on a large data set, we train USE with a data set of only 4000 adjectives conveying positive or negative sentiments. The sentiment analysis accuracies of NLTK-VADER and USE are compared in Figure <ref type="figure" target="#fig_3">3</ref>. The specialized USE outperforms the general-purpose NLTK-VADER, which confirms the strength of transfer learning when used in specific domains such as narrative text. The combination of NLP and DL approaches therefore seems to ease text understanding. USE gives similar accuracies (71% on average over all the six books) for gender analysis. As a first test for the classification of relationships, a simple naive Bayes classifier with bag-of-words features, trained with a small data set of 100 manually annotated sentences, was able to detect the notion of sibling with full accuracy.</p><p>The present work explored the synergy of natural language processing and deep learning for the analysis and summarization of literary texts. We see many directions to improve the techniques proposed here. First, we need to test the tool on an extensive benchmark of novels. The dynamic evolution of the graphs, obtained either incrementally or by using a fixed-length window over the pages, could be expressed by an animation. To achieve that, the problem of properly positioning the detected events in time, which are currently described only by their position in the text, should be tackled. Regarding the character identification module, a deeper analysis based on co-reference resolution and contextual analysis should be considered. Regarding the analysis of sentiments, gender and relationships, improvements can be achieved by recognizing the roles of the characters within the context. While semantic role labelling could be used for that by labelling triplets (subject, predicate, object) in a better way, in narrative texts the structure of the sentence may not follow a specific syntactic rule. In many cases, literary texts contain many conversations enriched with semantic usages. To tackle the problem more effectively, it is important that current cutting-edge deep learning approaches are used in conjunction with NLP techniques to achieve better text understanding.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Manually created visual outline of And the Mountains Echoed by K. Hosseini</figDesc><graphic coords="2,192.75,54.07,226.77,264.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The NOVEL2GRAPH workflow</figDesc><graphic coords="3,137.70,303.65,340.21,318.46" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Comparison of NLTK-VADER and USE approaches to sentiment analysis</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: NOVEL2GRAPH output of the second book of the Harry Potter series</figDesc><graphic coords="6,89.10,155.99,437.40,429.29" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://stanfordnlp.github.io/CoreNLP</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/hardik-vala/charles</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://sentiwordnet.isti.cnr.it</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://github.com/fnielsen/afinn</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://www.graphviz.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://nlp.stanford.edu/software/CRF-NER.shtml</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">https://www.clips.uantwerpen.be/pages/pattern-en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">https://www.nltk.org/_modules/nltk/sentiment/vader.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_8">https://tfhub.dev/google/universal-sentence-encoder/2</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_9">https://lesn.appstate.edu/fryeem/re4030/character_trait_descriptive_adje.htm</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>We acknowledge Arpitha Prasad Bharathi, who helped us in evaluating our results with the Harry Potter series throughout the work, and Markus Zohner, who first suggested us the idea of developing visual outlines of novels by means of AI techniques.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">ST-DBSCAN: An algorithm for clustering spatial-temporal data</title>
		<author>
			<persName><forename type="first">Derya</forename><surname>Birant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alp</forename><surname>Kut</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data &amp; Knowledge Engineering</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="208" to="221" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">The persistence of gender-based stereotypes in the language of Harry Potter and the Philosopher&apos;s Stone and Harry Potter and the Goblet of Fire</title>
		<author>
			<persName><forename type="first">Rebecca</forename><surname>Cripps</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Universal sentence encoder</title>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Cer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yinfei</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sheng-Yi</forename><surname>Kong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nan</forename><surname>Hua</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nicole</forename><surname>Limtiaco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rhomni</forename><surname>St John</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noah</forename><surname>Constant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mario</forename><surname>Guajardo-Cespedes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Steve</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chris</forename><surname>Tar</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1803.11175</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Character-based kernels for novelistic plot structure</title>
		<author>
			<persName><forename type="first">Micha</forename><surname>Elsner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics</title>
				<meeting>the 13th Conference of the European Chapter of the Association for Computational Linguistics</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="634" to="644" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Extracting social network from literature to predict antagonist and protagonist</title>
		<author>
			<persName><forename type="first">Matt</forename><surname>Fernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Peterson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ben</forename><surname>Ulmer</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Deep learning approach for sentiment analysis of short texts</title>
		<author>
			<persName><forename type="first">Abdalraouf</forename><surname>Hassan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ausif</forename><surname>Mahmood</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Control, Automation and Robotics (ICCAR), 2017 3rd International Conference on</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="705" to="710" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Deep unordered composition rivals syntactic methods for text classification</title>
		<author>
			<persName><forename type="first">Mohit</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Varun</forename><surname>Manjunatha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jordan</forename><surname>Boyd-Graber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hal</forename><surname>Daumé</surname><genName>III</genName></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing</title>
				<meeting>the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1681" to="1691" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Mapping emotions through time: How affective trajectories inform the language of emotion</title>
		<author>
			<persName><forename type="first">Tabitha</forename><surname>Kirkland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">William</forename><forename type="middle">A</forename><surname>Cunningham</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Emotion</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7a">
	<analytic>
		<title level="a" type="main">Ask me anything: Dynamic memory networks for natural language processing</title>
		<author>
			<persName><forename type="first">Ankit</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ozan</forename><surname>Irsoy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Peter</forename><surname>Ondruska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohit</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Bradbury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ishaan</forename><surname>Gulrajani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Victor</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Romain</forename><surname>Paulus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><surname>Socher</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1378" to="1387" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Improving distributional similarity with lessons learned from word embeddings</title>
		<author>
			<persName><forename type="first">Omer</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yoav</forename><surname>Goldberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ido</forename><surname>Dagan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions of the Association for Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="211" to="225" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8a">
	<monogr>
		<title level="m" type="main">Neologisms in Harry Potter books</title>
		<author>
			<persName><forename type="first">Marina</forename><surname>Martínez Carbajal</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Deep learning-based document modeling for personality detection from text</title>
		<author>
			<persName><forename type="first">Navonil</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Soujanya</forename><surname>Poria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Gelbukh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Erik</forename><surname>Cambria</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Intelligent Systems</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9a">
	<analytic>
		<title level="a" type="main">Distributed representations of words and phrases and their compositionality</title>
		<author>
			<persName><forename type="first">Tomas</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilya</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kai</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Greg</forename><forename type="middle">S</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeff</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
		<editor>
			<persName><forename type="first">C</forename><forename type="middle">J C</forename><surname>Burges</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>Ghahramani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><forename type="middle">Q</forename><surname>Weinberger</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="3111" to="3119" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Character-to-character sentiment analysis in Shakespeare&apos;s plays</title>
		<author>
			<persName><forename type="first">Eric</forename><forename type="middle">T</forename><surname>Nalisnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Henry</forename><forename type="middle">S</forename><surname>Baird</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 51st Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="479" to="483" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A survey of named entity recognition and classification</title>
		<author>
			<persName><forename type="first">David</forename><surname>Nadeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Satoshi</forename><surname>Sekine</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Lingvisticae Investigationes</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="3" to="26" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Natural language processing: Speaker, language, and gender identification with LSTM</title>
		<author>
			<persName><forename type="first">Mohammad</forename><forename type="middle">K</forename><surname>Nammous</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Khalid</forename><surname>Saeed</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advanced Computing and Systems for Security</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="143" to="156" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Essay on the linguistic features in J. K. Rowling&apos;s Harry Potter and the Philosopher&apos;s Stone</title>
		<author>
			<persName><forename type="first">Åsa</forename><surname>Nygren</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Studying literary characters and character networks</title>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Piper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mark</forename><surname>Algee-Hewitt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Koustuv</forename><surname>Sinha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Derek</forename><surname>Ruths</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hardik</forename><surname>Vala</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DH</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Pattern matching: The gestalt approach</title>
		<author>
			<persName><forename type="first">John</forename><forename type="middle">W</forename><surname>Ratcliff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">E</forename><surname>Metzener</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Dr. Dobb&apos;s Journal</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page">46</biblScope>
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Annotating characters in literary corpora: A scheme, the CHARLES tool, and an annotated novel</title>
		<author>
			<persName><forename type="first">Hardik</forename><surname>Vala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stefan</forename><surname>Dimitrov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Jurgens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Piper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Derek</forename><surname>Ruths</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC</title>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note>Roger Alan Stein, Patricia A. Jaques, and João Francisco Valiati. An analysis of hierarchical text classification using word embeddings. Information Sciences 471:216-232, 2019.</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Mr. Bennet, his coachman, and the archbishop walk into a bar but only one of them gets recognized: On the difficulty of detecting characters in literary texts</title>
		<author>
			<persName><forename type="first">Hardik</forename><surname>Vala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Jurgens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Piper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Derek</forename><surname>Ruths</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2015 Conference on Empirical Methods in Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="769" to="774" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">Ashish</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noam</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Niki</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jakob</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Llion</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aidan</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lukasz</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Illia</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Towards universal paraphrastic sentence embeddings</title>
		<author>
			<persName><forename type="first">John</forename><surname>Wieting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohit</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kevin</forename><surname>Gimpel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Karen</forename><surname>Livescu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1511.08198</idno>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
