<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Towards the Automatic Analysis of the Structure of News Stories</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Iqra</forename><surname>Zahid</surname></persName>
							<email>iqra.zahid@student.manchester.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">School of Arts, Languages and Cultures</orgName>
								<orgName type="institution">University of Manchester</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Hao</forename><surname>Zhang</surname></persName>
							<email>hao.zhang-17@postgrad.manchester.ac.uk</email>
							<affiliation key="aff1">
								<orgName type="department">School of Computer Science</orgName>
								<orgName type="institution">University of Manchester</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Frank</forename><surname>Boons</surname></persName>
							<email>frank.boons@manchester.ac.uk</email>
							<affiliation key="aff2">
								<orgName type="institution" key="instit1">Alliance Manchester Business School</orgName>
								<orgName type="institution" key="instit2">University of Manchester</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Riza</forename><surname>Batista-Navarro</surname></persName>
							<affiliation key="aff3">
								<orgName type="department">School of Computer Science</orgName>
								<orgName type="institution">University of Manchester</orgName>
								<address>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Towards the Automatic Analysis of the Structure of News Stories</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">3F2F802C1BD5C61BC8DB11678FF4C65D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:56+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>News stories are distinct from other types of narratives in that they typically follow a complex and non-chronological time structure. This poses challenges to the narrative analysis of news, specifically with respect to the construction of event sequences. In this paper, we propose to segment news story text according to news schema categories, which allow for identifying sentences describing a news story's main action and other actions that happened beforehand or subsequently. To automate this task, we made observations on the linguistic devices that are used by news writers, based on a manually annotated corpus of news articles that we have constructed. Heuristics capturing these linguistic devices were then developed, underpinned by natural language processing tools as well as carefully curated look-up lists of cues. While encouraging preliminary results were obtained, the work can be further expanded by observing and capturing more linguistic devices, which can be facilitated by further annotation of news stories based on news schema categories.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>In analysing narratives, understanding the sequence in which events occur is key <ref type="bibr" target="#b3">[Ell05]</ref>. Most types of narratives, e.g., novels and personal accounts of experiences, present events in chronological order. However, news stories, narratives that are written or recorded to "inform the public about current events, concerns or ideas" <ref type="bibr">[Whi]</ref>, deviate from other types of narratives in that they follow a complex time structure. News writers are expected to prioritise certain news values, i.e., criteria for judging "newsworthiness" (e.g., negativity, unexpectedness, superlativeness) <ref type="bibr" target="#b0">[Bel91]</ref>. In producing news stories that adhere to such news values, news writers adopt the instalment method, whereby an event that was introduced in the earlier parts of a story may be described in detail only later on in the story, possibly in multiple, separate instances. Consequently, events are usually presented in news stories in a non-chronological order.</p><p>Figure <ref type="figure">1</ref>: News schema proposed by Allan Bell <ref type="bibr" target="#b0">[Bel91]</ref>. Shown in grey are the most specific categories in the schema.</p><p>In order to understand the flow of events in news stories, it is necessary to analyse their schema, i.e., the overall form of news discourse, by which topics are organised. A news schema defines the syntax of news stories, providing a set of formal categories that form the basis of the hierarchical organisation and ordering of textual units <ref type="bibr" target="#b2">[Dij85]</ref>. In early work by Labov and Waletzky on discourse analysis, categories such as Abstract, Orientation, Complicating Action, Evaluation, Resolution and Coda were proposed in order to organise narratives of personal experiences <ref type="bibr" target="#b6">[LW67]</ref>. 
Building upon that work, van Dijk <ref type="bibr" target="#b2">[Dij85]</ref> developed a schema specifically for analysing news discourse. Each category in the schema, e.g., Main Event, Background and History, corresponds to a piece of text, i.e., a sequence of sentences. According to case studies carried out on hundreds of news reports published in more than 260 newspapers from 100 countries, this news schema is applicable at an international scale <ref type="bibr">[vD98]</ref>. A few years later, building upon van Dijk's work, Bell proposed a finer-grained news schema <ref type="bibr" target="#b0">[Bel91]</ref>. We reproduce a tree-like depiction of this schema in Figure <ref type="figure">1</ref>, in which the most specific (or lowest-level) categories are shown in grey.</p><p>Since the chronological order of events is not maintained in news stories (as discussed above), narrative analysis of news is more challenging than that of other types of narratives (e.g., novels). As human readers, we are accustomed to the style of reporting employed in news stories, and thus we might find determining the correct sequence of events simple and straightforward. However, to an automated system designed to support narrative analysis, the non-chronological order in which events are presented in news stories would pose a barrier to the reconstruction of event sequences.</p><p>In this paper, we aim to facilitate machine understanding of news stories by automatically decomposing them according to news schema categories. To this end, we first developed a corpus of written news stories in which spans of text corresponding to news schema categories have been manually annotated and labelled following the work of Bell <ref type="bibr" target="#b0">[Bel91]</ref>. We then identified the various linguistic devices usually employed by news writers that can help in mapping news story text to the respective news schema categories. 
On the basis of these, a heuristics-based approach was developed in order to automate the said task.</p><p>The remainder of this paper is organised as follows. Section 2 presents a review of previously reported related work. In Section 3, an analysis of linguistic devices used in the different news schema categories is presented, supported by an annotated corpus that we have recently developed. We also provide a discussion of the heuristics that were developed to detect the use of such linguistic devices, in order to identify parts of news story text that correspond to schema categories. Our preliminary results are then discussed in Section 4. Finally, we conclude and present our next steps in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>Most efforts to analyse news have focussed on identifying boundaries between news stories, rather than on analysing their structure individually. Early work employed cue words as well as named entities (e.g., names of people, places, organisations) in closed caption text, in order to detect transitions from one news story to another <ref type="bibr" target="#b8">[MMM97]</ref>. The Broadcast News Navigator (BNN) system similarly used cue words and named entities in segmenting closed caption text according to individual news stories <ref type="bibr" target="#b7">[May98]</ref>. Additionally, by selecting the sentence with the highest frequency of named entities, the system was able to generate a gist for each news story, which is slightly similar to the Action category in Bell's framework that pertains to the central or main action in a news story.</p><p>TextTiling, a text segmentation approach proposed by Hearst, was applied to the detection of boundaries between consecutive news articles in the Wall Street Journal <ref type="bibr" target="#b4">[Hea97]</ref>. Specifically, the approach segmented text according to subtopics, which were identified through the measurement of lexical cohesion. Other work sought to improve the definition and measurement of lexical cohesion by incorporating richer features (e.g., number of pronouns, similarities obtained by Latent Dirichlet Allocation), and was applied to the detection of news story boundaries in transcripts of broadcast news <ref type="bibr" target="#b15">[Sto03,</ref><ref type="bibr" target="#b12">RH06,</ref><ref type="bibr" target="#b10">PMD09]</ref>.</p><p>More similar to our own work are efforts aimed at segmenting individual narratives. 
The work of Kauchak and Chen <ref type="bibr" target="#b5">[KC05]</ref> was aimed at segmenting an individual narrative according to the topics it contains, casting the problem as a text classification task in which features such as word groups (identified with the aid of WordNet) and entity groups were learned using machine learning-based methods, i.e., support vector machines and decision stumps (one-level decision trees). This approach was applied to autobiography-style books and encyclopaedia articles. Our approach is distinct from this in that we seek to segment individual news stories, which, as discussed in the previous section, follow a different structure relative to other types of narratives. While the work of Cardoso et al. <ref type="bibr" target="#b1">[CTP13]</ref> was targeted towards the analysis of news text (written in Brazilian Portuguese), they also used topics as the basis of segmentation.</p><p>Our work aims to segment a narrative with the end-goal of following the flow or sequence of events, rather than identifying the different topics or themes it contains. While this bears similarities with the narrative segment annotation task proposed by Reiter <ref type="bibr" target="#b11">[Rei15]</ref>, which was manually applied to short stories, our approach is specifically aimed at automatically analysing news stories. The topic of a news story may consist of multiple interconnected events, and thus can be segmented according to news schema categories delineating the main event from events leading to it and following it. To the best of our knowledge, our proposed approach is the first to attempt to automatically analyse the structure of news stories in this way.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Methodology</head><p>Our proposed approach to the automatic segmentation of news text according to news schema categories is based on the analysis of the various linguistic devices used by writers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Corpus development</head><p>In order to support our analysis of linguistic devices, we developed a small corpus of written news articles, retrieved using the LexisNexis library<ref type="foot" target="#foot_0">1</ref>. Containing a total of 22 articles from various news agencies (listed in Appendix A), the corpus is partitioned into two: (1) a pre-2005 set, mostly containing news stories published in the 1980s and 1990s, that were used by van Dijk <ref type="bibr" target="#b2">[Dij85,</ref><ref type="bibr" target="#b16">vD98]</ref> and Bell <ref type="bibr" target="#b0">[Bel91]</ref> in designing their respective frameworks; and (2) a post-2005 set, containing eight news stories published more recently.</p><p>An annotation scheme was designed requiring annotators to map pieces of news text to the most specific categories of the news schema tree (shown in grey in Figure <ref type="figure">1</ref>). Guidelines were then established to promote consistency between annotators. Specifically, annotators were asked to provide annotations at the sentence level. That is, category labels were assigned to individual sentences, not to whole paragraphs (sequences of sentences) nor to clauses or phrases (parts of sentences). If a sequence of sentences corresponds to only one category, then each sentence in that sequence was labelled as that one category. To each sentence, only one of the eight most specific news schema categories (Action, Reaction, Consequence, Context, Evaluation, Expectation, Previous episode, History) was applied. We refer the reader to Appendix B for the definitions of these categories, as well as corresponding example sentences coming from a news story. Annotators were encouraged to first identify sentences in a news story that pertain to Action, as the other seven news schema categories are defined in relation to it. 
In cases where a sentence seemed to map to multiple categories, the annotator was asked to choose only one based on his or her best judgement.</p><p>Using the brat rapid annotation tool [SPT+12], two annotators carried out the annotation task. One annotator, a final-year linguistics student (the first author of this paper), marked up all of the 22 articles. The other annotator, a researcher with expertise in natural language processing and text mining (the last author), annotated only the post-2005 set. Shown in Figure <ref type="figure" target="#fig_0">2</ref> is a sample news story annotated and visualised in brat.</p><p>The resulting corpus contains a total of 570 sentences. The average number of sentences per news story is 26, with the shortest and longest news stories containing 11 and 53 sentences, respectively. Shown in Figure <ref type="figure" target="#fig_1">3</ref> is the number of annotated sentences in the corpus for each news schema category.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Linguistic Devices</head><p>Utilising our manually annotated corpus, we made observations on the various linguistic devices used by news story writers, as we posit that these are helpful in discriminating between different news schema categories. These observations are discussed for each of the eight most specific news schema categories, together with our proposed heuristics for automatically capturing them.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1">Action.</head><p>We observed that sentences pertaining to Action, defined as a central or main action in a news story, share lexical similarities with the title of the news story. For example, in the news article published by the BBC News on the 21st December 2018 entitled, "Cheshire BMW driver jailed over speeding ticket lie", the following sentences pertain to Action: (1) "A man who claimed his BMW had been cloned as part of an elaborate scam to avoid a speeding fine has been jailed."; and (2) "Robson was jailed for nine months at Chester Crown Court on Thursday." Words in these sentences such as "jailed" and "speeding" are shared (verbatim) with the title of the news story. Based on this observation, we automatically identified text pertaining to Action by checking for exact matches between the lemmatised words of a sentence and those of the news story title.</p></div>
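<div xmlns="http://www.tei-c.org/ns/1.0"><p>The title-overlap check described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the stop-word list and the suffix-stripping function toy_lemmas are hypothetical stand-ins for the lemmas that a parser would supply, and the number of shared lemmas required to label a sentence as Action is not specified in the text.</p>
<p>
```python
import re

# Hypothetical stop-word list; the text does not say which words were ignored.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "on", "at", "as",
              "for", "over", "his", "had", "has", "been", "was", "who"}


def toy_lemmas(text):
    """Lowercase, tokenise and strip a few suffixes.

    A crude stand-in (assumption) for real lemmatisation.
    """
    lemmas = set()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in STOP_WORDS:
            continue
        for suffix in ("ing", "ed", "s"):
            # Only strip when at least three characters remain.
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        lemmas.add(token)
    return lemmas


def shared_title_lemmas(sentence, title):
    """Return the lemmas that a sentence shares exactly with the title."""
    return toy_lemmas(sentence).intersection(toy_lemmas(title))


title = "Cheshire BMW driver jailed over speeding ticket lie"
sentence = "Robson was jailed for nine months at Chester Crown Court on Thursday."
print(shared_title_lemmas(sentence, title))
```
</p>
<p>A sentence with a non-empty result would be treated as a candidate for Action. Whether a single shared lemma should suffice is a design choice that this sketch leaves open.</p></div>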
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2">Reaction.</head><p>A Reaction is a verbal response to an Action given by an actor. A linguistic device commonly used by writers to describe Reactions is attribution, which indicates "who expressed what", where "what" pertains to a quotation or perception, and "who" denotes its original source, i.e., the actor. The information that is commonly attributed is direct speech (e.g., He said, "She will deny it.") or indirect speech (e.g., He said that she will deny it.) in which reporting verbs (e.g., "said", "announce", "comment", "mention") are often used. However, verbs that are less neutral and bear either positive or negative connotations may also be used, such as "applaud", "praise" and "complain".</p><p>To detect whether a sentence contains a Reaction, we leveraged previously reported work on attribution extraction, which is underpinned by a lexicon of attribution verbs <ref type="bibr">[ZBBNss]</ref>. As Reactions are responses to Actions, they often contain mentions that co-refer to either the Action itself or actors participating in it. Hence, a check for the use of definite noun phrases was also implemented in order to detect whether an attributed quotation contains any co-referring mentions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3">Consequence.</head><p>A Consequence is an occurrence that transpires as a result of an Action, with the exception of verbal responses (which are classified as Reactions). As such, Consequences often contain mentions that co-refer to either the Action itself or actors participating in it. Furthermore, discourse connectives signifying causation (e.g., "as a result", "because", "thereby") tend to be used in sentences pertaining to Consequence. In order to detect the use of such linguistic devices, we checked for the existence of definite noun phrases in sentences, as well as for the use of any of the discourse connectives annotated in the Penn Discourse Treebank (PDTB) [PDL+08] that denote the Contingency relation (with a minimum frequency of 4).</p></div>
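<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the connective check, the sketch below searches a sentence for causal discourse connectives. The cue list is an assumption: it holds only the three examples quoted above, not the PDTB-derived list, and the function name find_causal_connectives is hypothetical.</p>
<p>
```python
import re

# Illustrative cues only: the three examples quoted in the text,
# not the PDTB-derived list of Contingency connectives.
CAUSAL_CONNECTIVES = ("as a result", "because", "thereby")


def find_causal_connectives(sentence):
    """Return the connectives that occur as whole words or phrases."""
    lowered = sentence.lower()
    found = []
    for cue in CAUSAL_CONNECTIVES:
        # Word boundaries stop a cue from matching inside a longer word.
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered):
            found.append(cue)
    return found


print(find_causal_connectives("As a result, the road was closed."))
```
</p>
<p>In a fuller pipeline this check would presumably be combined with the definite noun phrase check, since a causal connective alone does not show that the sentence refers back to the Action.</p></div>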
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.4">Evaluation.</head><p>Evaluation consists of observations on an Action provided by the news writer (i.e., journalist) or an actor, that assess its impact or significance. As in Reaction, attribution is often used in Evaluation, specifically in cases where the observations are coming from actors. However, sentences conveying Evaluation can be identified by checking for the presence of graded adjectives, as these often indicate assessment, i.e., the degree to which a quality holds (e.g., "deep", "strongest", "biggest"). In support of this step, more than 260 graded adjectives were collected from the Collins Cobuild Grammar Patterns reference book <ref type="bibr" target="#b13">[Sin98]</ref> and compiled into a look-up list.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.5">Expectation.</head><p>Like Evaluation, Expectation comprises observations provided by the news writer or an actor (in which case attribution is also used), but pertains to their views on what could happen in the future. As such, sentences corresponding to Expectation make use of speculative language. To facilitate the detection of speculative language, we checked for the presence of modal verbs (e.g., "could", "may"), as well as for the presence of modifiers that indicate uncertainty. A list of such modifiers was drawn from uncertainty cues in the WikiWeasel 2.0 corpus, that were manually annotated by Vincze <ref type="bibr" target="#b17">[Vin14]</ref>.</p></div>
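<div xmlns="http://www.tei-c.org/ns/1.0"><p>The speculative-language check can be sketched as follows, assuming that each sentence arrives as a list of (lemma, POS tag) pairs with Penn Treebank-style tags, in which MD marks a modal verb. The cue set UNCERTAINTY_CUES is a small hypothetical sample, not the list drawn from WikiWeasel 2.0.</p>
<p>
```python
# Hypothetical sample of uncertainty modifiers; the actual list was
# drawn from the WikiWeasel 2.0 corpus and is not reproduced here.
UNCERTAINTY_CUES = {"possibly", "perhaps", "probably", "likely"}


def is_speculative(tagged_tokens):
    """Return True if any token is a modal verb or an uncertainty cue.

    tagged_tokens: list of (lemma, pos) pairs; "MD" is assumed to be
    the tag for modal verbs.
    """
    for lemma, pos in tagged_tokens:
        if pos == "MD" or lemma.lower() in UNCERTAINTY_CUES:
            return True
    return False


sentence = [("anniversary", "NN"), ("could", "MD"), ("inspire", "VB")]
print(is_speculative(sentence))
```
</p>
<p>Relying on the POS tag rather than a fixed list of modal verbs means that any modal recognised by the tagger is covered, at the cost of also flagging modals that are not speculative.</p></div>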
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.6">Context.</head><p>Similar to Evaluation and Expectation, Context refers to observations given by either the news writer or an actor in order to provide additional information that helps explain or clarify details surrounding an Action. Based on our observations, sentences that fall under this category do not have any defining linguistic features (unlike Evaluation and Expectation as described above), except for the prevalent use of co-referring mentions. We detected this by checking if definite noun phrases appear as either the subject or object of sentences.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.7">Previous episode.</head><p>Sentences pertain to a Previous episode if they describe any event that happened prior to an Action, in the not-so-distant (or near) past. The main verbs of such sentences are often in either the past or past perfect tense. Additionally, relative temporal expressions pertaining to recent points in time (e.g., "last week", "previously", "on Friday") also tend to be used in specifying the time of occurrence of events falling under Previous episode.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.8">History.</head><p>Similar to Previous episode, History describes events that happened prior to an Action, but before the near past. Sentences that belong to this category typically have main verbs in either the past or past perfect tense. They also describe events whose time of occurrence is mentioned in the form of absolute temporal expressions (e.g., "in 1989"). However, relative temporal expressions may also be used, although these would pertain to a point in time from the distant past (e.g., "three decades ago").</p></div>
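<div xmlns="http://www.tei-c.org/ns/1.0"><p>The distinction between Previous episode and History drawn in Sections 3.2.7 and 3.2.8 can be sketched with regular expressions. The patterns below are assumptions that cover little more than the examples quoted in those sections, and the function name temporal_hint and the precedence given to History are hypothetical. The verb tense check is omitted.</p>
<p>
```python
import re

# Absolute expressions such as "in 1989" (assumed pattern).
ABSOLUTE = re.compile(r"\b(?:in|since|during)\s+(?:1[89]|20)\d{2}\b",
                      re.IGNORECASE)
# Relative expressions pointing to the distant past, e.g. "three decades ago".
RELATIVE_DISTANT = re.compile(r"\b(?:decades?|centur(?:y|ies))\s+ago\b",
                              re.IGNORECASE)
# Relative expressions pointing to the near past, e.g. "last week", "on Friday".
RELATIVE_RECENT = re.compile(
    r"\b(?:last\s+(?:week|month|night)|previously|yesterday"
    r"|on\s+(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day)\b",
    re.IGNORECASE,
)


def temporal_hint(sentence):
    """Suggest a category from temporal expressions alone, or None."""
    if ABSOLUTE.search(sentence) or RELATIVE_DISTANT.search(sentence):
        return "History"
    if RELATIVE_RECENT.search(sentence):
        return "Previous episode"
    return None


print(temporal_hint("The protest on Friday came after officials intervened."))
```
</p>
<p>The result is only a hint: a sentence would still need a main verb in the past or past perfect tense before either label is assigned.</p></div>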
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Preliminary Results</head><p>In implementing the heuristics for capturing linguistic devices that were discussed in the previous section, a pipeline for preprocessing was developed, based on three tools. Firstly, we made use of the LingPipe sentence splitter<ref type="foot" target="#foot_1">2</ref>, to automatically segment news text into individual sentences. Each sentence is then decomposed into tokens by the GENIA Tagger<ref type="foot" target="#foot_2">3</ref>. The tokenised sentence is then given as input to the Enju Parser<ref type="foot" target="#foot_3">4</ref>, through which we obtained not only the part-of-speech (POS) tag and lemma for each token but also predicate-argument structures identifying the sentence's main verb and its arguments (i.e., subject and object).</p><p>We then developed (in Python) rules for analysing the preprocessing results, for each news schema category (as described in the previous section). These include: (1) checking for specific values of POS tags, e.g., to check for verb tense and for modal verbs; (2) matching lemmatised tokens in look-up lists, e.g., of uncertainty cues, graded adjectives; (3) checking for definite noun phrases and whether they act as the subject or object of a main verb; and (4) matching against regular expressions designed to capture absolute and relative temporal expressions.</p><p>When applied to the post-2005 set of our annotated corpus (containing eight news stories), our heuristics obtained an overall F-score of 64% over all eight news schema categories (precision = 70%, recall = 59%).</p></div>
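<div xmlns="http://www.tei-c.org/ns/1.0"><p>The reported figures can be checked against one another, assuming that the F-score is the balanced F1 measure, i.e., the harmonic mean of precision and recall. The text does not state which variant was used, so this is a consistency check under that assumption.</p>
<p>
```python
def f_score(precision, recall):
    """Balanced F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported values: precision = 70%, recall = 59%.
print(round(f_score(0.70, 0.59), 2))
```
</p>
<p>With these inputs the harmonic mean rounds to 0.64, which agrees with the reported overall F-score of 64%.</p></div>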
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Future Work and Conclusion</head><p>In this paper, we presented our work on automatically analysing the structure of news stories according to news schema categories. While our preliminary results are encouraging, there is significant room for improvement. Recognising that the current version of our annotated corpus is limited in size, we shall dedicate a large part of our immediate next steps to expanding it. This will allow us to observe any further linguistic devices used in each news schema category, and in turn, to eventually extend our heuristics. Our annotated corpus will be made publicly available upon completion of this planned expansion. We then intend to investigate how our automatically assigned news schema categories can be used as features to inform event temporal relation extraction, with a view to automatically constructing event sequences.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.0.1">Acknowledgements</head><p>The research on which this article is based was partially funded by the Alliance Manchester Business School Strategic Investment Fund.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: News story "Students Defiant as Chinese University Cracks Down on Young Communists" (Javier C. Hernandez, The New York Times, 28 December 2018) annotated and visualised using the brat annotation tool.</figDesc><graphic coords="4,64.80,54.06,485.99,240.46" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Number of sentences in the annotated corpus, for each news schema category.</figDesc><graphic coords="4,194.41,550.96,226.78,136.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="2,89.10,54.07,437.39,268.68" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://www.lexisnexis.com/uk/legal/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://alias-i.com/lingpipe/index.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://www.nactem.ac.uk/GENIA/tagger/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://www.nactem.ac.uk/enju/</note>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The students are part of a small but tenacious group of young communists using leftist ideology to shine a light on labor abuses across China and to call for better protections for the working class.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Evaluation</head><p>An assessment of the significance of the Action</p><p>The stern reaction by the authorities reflects the party's deep anxieties about the young communists and their unusual campaign.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Expectation</head><p>A view on what could happen after the Action</p><p>Party leaders may be concerned that the 30th anniversary of the massacre, coming up in June, could inspire new protests.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Previous episode</head><p>Events that happened more recently (in the near past)</p><p>The protest on Friday came after Peking University officials tried to block a Marxist student group from organizing a celebration for Mao's 125th birthday.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>History</head><p>Set of events that happened before the near past</p><p>The party has long feared student-led protests, especially since the 1989 pro-democracy movement, which had deep student involvement and was crushed in a bloody crackdown around Tiananmen Square.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">The Language of News Media</title>
		<author>
			<persName><forename type="first">Allan</forename><surname>Bell</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1991">1991</date>
			<publisher>Blackwell</publisher>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="147" to="174" />
			<pubPlace>Oxford, UK</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Subtopic annotation in a corpus of news texts: Steps towards automatic subtopic segmentation</title>
		<author>
			<persName><forename type="first">Paula</forename><forename type="middle">C F</forename><surname>Cardoso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maite</forename><surname>Taboada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Thiago</forename><forename type="middle">A S</forename><surname>Pardo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 9th Brazilian Symposium in Information and Human Language Technology</title>
				<meeting>the 9th Brazilian Symposium in Information and Human Language Technology</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Structures of news in the press</title>
		<author>
			<persName><forename type="first">Teun</forename><forename type="middle">A</forename><surname>van Dijk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Discourse and Communication: New Approaches to the Analysis of Mass Media Discourse and Communication</title>
				<imprint>
			<date type="published" when="1985">1985</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">Jane</forename><surname>Elliott</surname></persName>
		</author>
		<title level="m">Using Narrative in Social Research</title>
				<meeting><address><addrLine>London, UK</addrLine></address></meeting>
		<imprint>
			<publisher>SAGE Publications Ltd</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="2" to="16" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">TextTiling: Segmenting text into multi-paragraph subtopic passages</title>
		<author>
			<persName><forename type="first">Marti</forename><forename type="middle">A</forename><surname>Hearst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Feature-based segmentation of narrative documents</title>
		<author>
			<persName><forename type="first">David</forename><surname>Kauchak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francine</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing, FeatureEng &apos;05</title>
				<meeting>the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing, FeatureEng &apos;05<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="32" to="39" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Narrative Analysis</title>
		<author>
			<persName><forename type="first">William</forename><surname>Labov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joshua</forename><surname>Waletzky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Essays on the Verbal and Visual Arts: Proceedings of the 1966 Annual Spring Meeting of the American Ethnological Society</title>
				<meeting><address><addrLine>Seattle, USA</addrLine></address></meeting>
		<imprint>
			<publisher>University of Washington Press</publisher>
			<date type="published" when="1967">1967</date>
			<biblScope unit="page" from="12" to="44" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Discourse cues for broadcast news segmentation</title>
		<author>
			<persName><forename type="first">Mark</forename><forename type="middle">T</forename><surname>Maybury</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 2, ACL &apos;98/COLING &apos;98</title>
				<meeting>the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 2, ACL &apos;98/COLING &apos;98<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="819" to="822" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Broadcast news navigation using story segmentation</title>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Merlino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daryl</forename><surname>Morey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mark</forename><surname>Maybury</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth ACM International Conference on Multimedia, MULTIMEDIA &apos;97</title>
				<meeting>the Fifth ACM International Conference on Multimedia, MULTIMEDIA &apos;97<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page" from="381" to="391" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The Penn Discourse TreeBank 2.0</title>
		<author>
			<persName><forename type="first">Rashmi</forename><surname>Prasad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nikhil</forename><surname>Dinesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alan</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eleni</forename><surname>Miltsakaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Livio</forename><surname>Robaldo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aravind</forename><forename type="middle">K</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bonnie</forename><forename type="middle">L</forename><surname>Webber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC</title>
				<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">News story segmentation in multiple modalities</title>
		<author>
			<persName><forename type="first">G</forename><surname>Poulisse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Moens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Dekens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Seventh International Workshop on Content-Based Multimedia Indexing</title>
				<imprint>
			<date type="published" when="2009-06">June 2009</date>
			<biblScope unit="page" from="25" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Towards Annotating Narrative Segments</title>
		<author>
			<persName><forename type="first">Nils</forename><surname>Reiter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities</title>
				<meeting>the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="34" to="38" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Story segmentation of broadcast news in English, Mandarin and Arabic</title>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Rosenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Julia</forename><surname>Hirschberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, NAACL-Short &apos;06</title>
				<meeting>the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, NAACL-Short &apos;06<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="125" to="128" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Collins Cobuild Grammar Patterns 2: Nouns and Adjectives</title>
		<author>
			<persName><forename type="first">John</forename><surname>Sinclair</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1998">1998</date>
			<publisher>Harper Collins Publishers</publisher>
			<pubPlace>London, UK</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">BRAT: A Web-based Tool for NLP-assisted Text Annotation</title>
		<author>
			<persName><forename type="first">Pontus</forename><surname>Stenetorp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sampo</forename><surname>Pyysalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Goran</forename><surname>Topić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tomoko</forename><surname>Ohta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sophia</forename><surname>Ananiadou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jun'ichi</forename><surname>Tsujii</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL &apos;12</title>
				<meeting>the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL &apos;12<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="102" to="107" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Spoken and written news story segmentation using lexical chains</title>
		<author>
			<persName><forename type="first">Nicola</forename><surname>Stokes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Proceedings of the HLT-NAACL 2003 Student Research Workshop - Volume 3, NAACLstudent &apos;03</title>
				<meeting>the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: the HLT-NAACL 2003 Student Research Workshop - Volume 3, NAACLstudent &apos;03<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="49" to="54" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">Teun</forename><surname>van Dijk</surname></persName>
		</author>
		<title level="m">News Analysis: Case Studies of International and National News in the Press</title>
				<meeting><address><addrLine>Hillsdale, New Jersey, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Lawrence Erlbaum Associates</publisher>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Uncertainty Detection in Natural Language Texts</title>
		<author>
			<persName><forename type="first">Veronika</forename><surname>Vincze</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014-07">July 2014</date>
			<pubPlace>Szeged, Hungary</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Doctoral School in Computer Science, University of Szeged</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">Aimee</forename><surname>Whitman</surname></persName>
		</author>
		<ptr target="https://ctb.ku.edu/en/table-of-contents/advocacy/media-advocacy/news-stories-media-wants/main" />
		<title level="m">The Community Toolbox: Creating News Stories the Media Wants</title>
				<imprint>
			<date type="published" when="2019-01-30">2019-01-30</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Whose Story Is It Anyway? Automatic Extraction of Accounts from News Articles</title>
		<author>
			<persName><forename type="first">Hao</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><surname>Boons</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Riza</forename><surname>Batista-Navarro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing and Management</title>
		<imprint>
			<date/>
		</imprint>
	</monogr>
	<note>In press</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
