<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journalist's Questions</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Giveme5W1H: A Universal System for Extracting Main Events from News Articles</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Felix Hamborg</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Corinna Breitinger</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bela Gipp</string-name>
          <email>gipp@uni-wuppertal.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Konstanz</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Wuppertal</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
        <p>Event extraction from news articles is a commonly required prerequisite for various tasks, such as article summarization, article clustering, and news aggregation. Due to the lack of universally applicable and publicly available methods tailored to news datasets, many researchers redundantly implement event extraction methods for their own projects. The journalistic 5W1H questions are capable of describing the main event of an article, i.e., by answering who did what, when, where, why, and how. We provide an in-depth description of an improved version of Giveme5W1H, a system that uses syntactic and domain-specific rules to automatically extract the relevant phrases from English news articles to provide answers to these 5W1H questions. Given the answers to these questions, the system determines an article's main event. In an expert evaluation with three assessors and 120 articles, we determined an overall precision of p=0.73, and p=0.82 for answering the first four W questions, which alone can sufficiently summarize the main event reported on in a news article. We recently made our system publicly available, and it remains the only universal open-source 5W1H extractor capable of being applied to a wide range of use cases in news analysis.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Computing methodologies → Information extraction •
Information systems → Content analysis and feature selection •
Information systems → Summarization</p>
      <p>Copyright © 2019 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
      </p>
    </sec>
    <sec id="sec-2">
      <label>1</label>
      <title>INTRODUCTION</title>
      <p>
        The extraction of a news article’s main event is an automated
analysis task at the core of a range of use cases, including news
aggregation, clustering of articles reporting on the same event,
and news summarization [
        <xref ref-type="bibr" rid="ref15 ref4">4, 15</xref>
        ]. Beyond computer science,
other disciplines also analyze news events, for example,
researchers from the social sciences analyze how news outlets
report on events in what is known as frame analyses [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ].
      </p>
      <p>
        Despite main event extraction being a fundamental task in
news analysis, no publicly available method exists that can be
applied to the diverse use cases mentioned to capably extract
explicit event descriptors from a given article [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Explicit
event descriptors are properties that occur in a text to describe
an event, e.g., the phrases in an article that enable a reader to
understand what the article is reporting on. The reliable
extraction of event-describing phrases also allows later analysis tasks
to use common natural language processing (NLP) methods,
such as TF-IDF and cosine similarity, including named entity
recognition (NER) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and named entity disambiguation
(NERD) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] to assess the similarity of two events.
State-of-the-art methods for extracting events from articles suffer from
three main shortcomings [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. First, most approaches only
detect events implicitly, e.g., by employing topic modeling [
        <xref ref-type="bibr" rid="ref2 ref42">2, 42</xref>
        ].
Second, they are specialized for the extraction of task-specific
properties, e.g., extracting only the number of injured people in
an attack [
        <xref ref-type="bibr" rid="ref32 ref42">32, 42</xref>
        ]. Lastly, some methods extract explicit
descriptors, but are not publicly available, or are described in
insufficient detail to allow researchers to reimplement the
approaches [
        <xref ref-type="bibr" rid="ref34 ref45 ref47 ref48">34, 45, 47, 48</xref>
        ].
      </p>
      <p>
        Last year, we introduced Giveme5W1H in the form of a
poster abstract [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which was at that time still an in-progress
prototype capable of extracting universally usable phrases that
answer the journalistic 5W1H questions, i.e., who did what,
when, where, why, and how (see Figure 1). This poster,
however, did not disclose or discuss the scoring mechanisms used
for determining the best candidate phrases during main event
extraction. In this paper, we describe in detail how the
improved version of Giveme5W1H extracts 5W1H phrases and we
describe the results of our evaluation of these improvements.
We also introduce an annotated data set, which we created to
train our system’s model to improve extraction performance.
The training data set is available in the online repository (see
Section 6) and can be used by other researchers to train their
own 5W1H approaches. This paper is relevant to researchers
and developers from various disciplines with the shared aim of
extracting and analyzing the main events that are being
reported on in articles.
      </p>
      <p>Taliban attacks German consulate in northern Afghan city of
Mazar-i-Sharif with truck bomb</p>
      <p>The death toll from a powerful Taliban truck bombing at the German
consulate in Afghanistan's Mazar-i-Sharif city rose to at least six
Friday, with more than 100 others wounded in a major militant assault.</p>
      <p>The Taliban said the bombing late Thursday, which tore a massive
crater in the road and overturned cars, was a "revenge attack" for US
air strikes this month in the volatile province of Kunduz that left 32
civilians dead. […] The suicide attacker rammed his explosives-laden
car into the wall […].</p>
      <p>
        Our objective is to devise an automated method for extracting
the main event being reported on by a given news article. For
this purpose, we exclude non-event-reporting articles, such as
commentaries or press reviews. First, we define the extracted
main event descriptors to be concise (requirement R1). This
means they must be as short as possible and contain only the
information describing the event, while also being as long as
necessary to contain all information of the event. Second, the
descriptors must be of high accuracy (R2). For this reason, we
give higher priority to extraction accuracy than execution
speed [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. We also defined that the developed system must
achieve a higher extraction accuracy than Giveme5W [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
Compared to Giveme5W, the system proposed in this paper not
only additionally extracts the ‘how’ answer, but its analysis
workflow is more semantics-oriented to address the issues of
the previous statistics- and syntax-based extraction. We also
publish the first annotated 5W1H dataset, which we use to
learn the optimal parameters. In the Giveme5W
implementation, the values were based on expert judgement.
      </p>
      <p>The presented system especially benefits: (1) social
scientists with limited programming knowledge, who would benefit
from ready-to-use main event extraction methods, and (2)
computer scientists who are welcome to modify or build on any
of the modular components of our system and use our test
collection and results as a benchmark for their implementations.
      </p>
    </sec>
    <sec id="sec-3">
      <label>2</label>
      <title>RELATED WORK</title>
      <p>
        The extraction of 5W1H phrases from news articles is related
to closed-domain question answering, which is why some
authors call their approaches 5W1H question answering (QA)
systems. Hamborg et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] gave an in-depth overview of 5W1H
extraction systems. Thus, we only provide a brief summary of
the current state-of-the-art and focus this section on the
extraction of the ‘how’ phrases. Most systems focus only on the
extraction of 5W phrases without ‘how’ phrases (cf. [
        <xref ref-type="bibr" rid="ref34 ref47 ref48 ref9">9, 34, 47,
48</xref>
        ]). The authors of prior work do not justify this, but we
suspect two reasons.
      </p>
      <p>
        First, the ‘how’ question is particularly difficult to extract due
to its ambiguity, as we will explain later in this section. Second,
‘how’ (and ‘why’) phrases are considered less important in
many use cases when compared to the other phrases,
particularly those answering the ‘who’, ‘what’, ‘when’, and ‘where’
(4W) questions (cf. [
        <xref ref-type="bibr" rid="ref21 ref40 ref49">21, 40, 49</xref>
        ]). For the sake of readability in
this section, we will also include approaches that only extract
the 5Ws when referring to 5W1H extraction. Aside from the ‘how’
extraction, the analysis of approaches for 5W1H or
5W-extraction is generally the same.
      </p>
      <p>
        Systems for 5W1H QA on news texts typically perform three
tasks to determine the article’s main event [
        <xref ref-type="bibr" rid="ref45 ref47">45, 47</xref>
        ]: (1)
preprocessing, (2) phrase extraction [
        <xref ref-type="bibr" rid="ref10 ref25 ref36 ref47 ref48">10, 25, 36, 47, 48</xref>
        ], where for
instance, linguistic rules are used to extract phrase candidates,
and (3) candidate scoring, which selects the best answer for
each question by employing heuristics, such as the position of
a phrase within the document. The input data to QA systems is
usually text, such as a full article including the headline, lead
paragraph, and main text [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ], or a single sentence, e.g., in news
ticker format [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ]. Other systems use automatic speech
recognition (ASR) to convert broadcasts into text [
        <xref ref-type="bibr" rid="ref47">47</xref>
        ]. The outcomes
of the process are six textual phrases, one for each of the 5W1H
questions, which together describe the main event of a given
news text, as highlighted in Figure 1. Thus far, no systems have
been described in sufficient detail to allow for a
reimplementation by other researchers.
      </p>
      <p>
        Both the ‘why’ and ‘how’ questions pose a particular
challenge in comparison to the other questions. As discussed by
Hamborg et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], determining the reason or cause (i.e. ‘why’)
can even be difficult for humans. Often the reason is unknown,
or it is only described implicitly, if at all [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Extracting the
‘how’ answer is also difficult, because this question can be
answered in many ways. To find ‘how’ candidates, the system by
Sharma et al. extracts the adverb or adverbial phrase within the
‘what’ phrase [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. The tokens extracted with this simplistic
approach merely qualify the verb, e.g., “He drove quickly”, but do not
describe the method by which the action was performed (cf. [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]), e.g.,
by ramming an explosive-laden car into the consulate (in the
example in Figure 1), which is a prepositional phrase. Other
approaches employ ML [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], but have not been devised for the
English language. In summary, few approaches exist that
extract ‘how’ phrases. The reviewed approaches provide no
details on their extraction method, and achieve poor results, e.g.,
they extract adverbs rather than the tool or the method by
which an action was performed (cf. [
        <xref ref-type="bibr" rid="ref22 ref24 ref36">22, 24, 36</xref>
        ]).
      </p>
      <p>
        None of the reviewed approaches output canonical or
normalized data. Canonical output is more concise and also less
ambiguous than its original textual form (cf. [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ]), e.g.,
polysemes, such as crane (animal or machine), have multiple
meanings. Hence, canonical data is often more useful for subsequent
analysis tasks (see Section 1). Phrases containing temporal
information or location information may be canonicalized, e.g., by
converting the phrases to dates or timespans [
        <xref ref-type="bibr" rid="ref38 ref7">7, 38</xref>
        ] or to
precise geographic positions [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Phrases answering the other
questions could be canonicalized by employing NERD on the
contained NEs, and then linking the NEs to concepts defined in
a knowledge graph, such as YAGO [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], or WordNet [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ].
      </p>
      <p>[Figure: analysis pipeline of Giveme5W1H. Input (raw article) via the Python interface or the RESTful API; preprocessing (sentence splitting, tokenization, POS tagging and parsing, NER, coreference resolution with CoreNLP; canonicalization with SUTime, Nominatim, and AIDA; cache; custom enrichment); phrase extraction (action, environment, cause, method); candidate scoring (who &amp; what, when, where, why, how; combined scoring); output (5W1H phrases). Legend: input / output, analysis process, 3rd-party libraries.]</p>
      <p>
        While the evaluations of reviewed papers generally indicate
sufficient quality to be usable for news event extraction, e.g.,
the system by Yaman et al. achieved <inline-formula><tex-math>F_1 = 0.85</tex-math></inline-formula> on the DARPA
corpus from 2009 [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ], the evaluations lack comparability for
two reasons. First, no gold standard exists for journalistic
5W1H question answering on news articles. A few datasets
exist for automated question answering, specifically for the
purpose of disaster tracking [
        <xref ref-type="bibr" rid="ref28 ref41">28, 41</xref>
        ]; however, these datasets are
so specialized to their own use cases that they cannot be
applied to the use case of automated journalistic question
answering. Another challenge to the evaluation of news event
extraction is that the evaluation data sets of previous papers are
no longer publicly available [
        <xref ref-type="bibr" rid="ref34 ref47 ref48">34, 47, 48</xref>
        ]. Second, previous
papers each used different quality measures, such as precision
and recall [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or error rates [
        <xref ref-type="bibr" rid="ref47">47</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <label>3</label>
      <title>GIVEME5W1H: DESCRIPTION OF METHODS AND SYSTEM</title>
      <p>
        Giveme5W1H is an open-source main event retrieval system for
news articles that addresses the objectives we defined in
Section 1. The system extracts 5W1H phrases that describe the
most defining characteristics of a news event, i.e., who did what,
when, where, why, and how. This section describes the analysis
workflow of Giveme5W1H, as shown in Figure 1. Giveme5W1H
can be accessed by other software as a Python library and via a
RESTful API. Due to its modularity, researchers can efficiently
adapt or replace components. For example, researchers can
integrate a custom parser or adapt the scoring functions tailored
to the characteristics of their data. The system builds on
Giveme5W [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], but improves extraction performance by
addressing the planned future work directions: Giveme5W1H
uses coreference resolution, question-specific semantic
distance measures, combined scoring of candidates, and extracts
phrases for the ‘how’ question. The values of the parameters
introduced in this section result from a semi-automated search
for the optimal configuration of Giveme5W1H using an
annotated learning dataset including a manual, qualitative revision
(see Section 3.5).
      </p>
    </sec>
    <sec id="sec-5">
      <label>3.1</label>
      <title>Preprocessing of News Articles</title>
      <p>Giveme5W1H accepts as input the full text of a news article,
including headline, lead paragraph, and body text. The user can
specify these three components as one or separately.
Optionally, the article’s publishing date can be provided, which helps
Giveme5W1H parse relative dates, such as “yesterday at 1 pm”.</p>
      <p>During preprocessing, we use Stanford CoreNLP for sentence
splitting, tokenization, lemmatization, POS-tagging, full
parsing, NER (with Stanford NER’s seven-class model), and
pronominal and nominal coreference resolution. Since our main
goal is high 5W1H extraction accuracy (rather than fast
execution speed), we use the best-performing model for each of the
CoreNLP annotators, i.e., the ‘neural’ model if available. We use
the default settings for English in all libraries.</p>
      <p>After the initial preprocessing, we bring all NEs in the text
into their canonical form. Following from requirement R1,
canonical information is the preferred output of Giveme5W1H,
since it is the most concise form. Because Giveme5W1H uses
the canonical information to extract and score ‘when’ and
‘where’ candidates, we implement the canonicalization task
during preprocessing.</p>
      <p>
        We parse dates written in natural language into canonical
dates using SUTime [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. SUTime looks for NEs of the type date
or time and merges adjacent tokens to phrases. SUTime also
handles heterogeneous phrases, such as “yesterday at 1 pm”,
which consist not only of temporal NEs but also other tokens,
such as function words. Subsequently, SUTime converts each
temporal phrase into a standardized TIMEX3 [
        <xref ref-type="bibr" rid="ref44">44</xref>
        ] instance.
TIMEX3 defines various types, also including repetitive
periods. Since events according to our definition occur at a single
point in time, we only retrieve datetimes indicating an exact
time, e.g., “yesterday at 6pm”, or a duration, e.g., “yesterday”,
which spans the whole day.
      </p>
      <p>
        Geocoding is the process of parsing places and addresses
written in natural language into canonical geocodes, i.e., one or
more coordinates referring to a point or area on earth. We look
for tokens classified as NEs of the type location (cf. [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ]). We
merge adjacent tokens of the same NE type within the same
sentence constituent, e.g., within the same NP or VP. Similar to
temporal phrases, locality phrases are often heterogeneous, i.e.,
they contain not only location NEs but also function words.
Hence, we introduce a locality phrase merge range <inline-formula><tex-math>r_{\text{where}} = 1</tex-math></inline-formula>,
to merge phrases where up to <inline-formula><tex-math>r_{\text{where}}</tex-math></inline-formula> arbitrary NE tokens are
allowed between two location NEs. Lastly, we geocode the
merged phrases with Nominatim<sup>1</sup>, which uses free data from
OpenStreetMap.
      </p>
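      <p>The merging rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the (token, NE label) input format, the label strings, and the function name merge_location_phrases are hypothetical, gap tokens of any kind are bridged, and the check that merged tokens belong to the same sentence constituent is omitted.</p>
      <preformat><![CDATA[
```python
from typing import List, Tuple

Token = Tuple[str, str]  # (text, NE label); the label strings are assumed


def merge_location_phrases(tokens: List[Token], merge_range: int = 1) -> List[str]:
    """Merge location NE tokens into phrases, bridging gaps of up to
    `merge_range` tokens between two location tokens."""
    phrases: List[List[str]] = []
    current: List[str] = []  # tokens of the phrase being built
    pending: List[str] = []  # gap tokens seen since the last location token

    for text, label in tokens:
        if label == "LOCATION":
            if current and len(pending) <= merge_range:
                current.extend(pending)  # gap is short enough: bridge it
            elif current:
                phrases.append(current)  # gap too long: close the phrase
                current = []
            current.append(text)
            pending = []
        elif current:
            pending.append(text)  # possible gap inside a phrase

    if current:
        phrases.append(current)
    return [" ".join(phrase) for phrase in phrases]


if __name__ == "__main__":
    sentence = [
        ("bombing", "O"), ("in", "O"), ("Mazar-i-Sharif", "LOCATION"),
        (",", "O"), ("Afghanistan", "LOCATION"), ("late", "O"),
        ("on", "O"), ("Thursday", "DATE"), ("near", "O"),
        ("Kunduz", "LOCATION"),
    ]
    # prints ['Mazar-i-Sharif , Afghanistan', 'Kunduz']
    print(merge_location_phrases(sentence))
```
]]></preformat>
      <p>With a merge range of 1, the comma between the two adjacent location tokens is bridged, whereas the longer gap before the third location starts a new phrase. Each merged phrase would then be passed to the geocoder.</p>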
      <p>
        We canonicalize NEs of the remaining types, e.g., persons
and organizations, by linking NEs to concepts in the YAGO
graph [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] using AIDA [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The YAGO graph is a
state-of-the-art knowledge base, where nodes in the graph represent
semantic concepts that are connected to other nodes through
attributes and relations. The data is derived from other
well-established knowledge bases, such as Wikipedia, WordNet,
WikiData, and GeoNames [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <label>3.2</label>
      <title>Phrase Extraction</title>
      <p>Giveme5W1H performs four independent extraction chains to
retrieve the article’s main event: (1) the action chain extracts
phrases for the ‘who’ and ‘what’ questions, (2) environment for
‘when’ and ‘where’, (3) cause for ‘why’, and (4) method for
‘how’.</p>
      <p>
        The action extractor identifies who did what in the article’s
main event. The main idea for retrieving ‘who’ candidates is to
collect the subject of each sentence in the news article.
Therefore, we extract the first NP that is a direct child to the sentence
in the parse tree, and that has a VP as its next right sibling (cf.
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). We discard all NPs that contain a child VP, since such NPs
yield lengthy ‘who’ phrases. Take, for instance, this sentence:
“((NP) Mr. Trump, ((VP) who stormed to a shock election
victory on Wednesday)), ((VP) said it was […])”, where “who
stormed […]” is the child VP of the NP. We then put the NPs into
the list of ‘who’ candidates. For each ‘who’ candidate, we take
the VP that is the next right sibling as the corresponding ‘what’
candidate (cf. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). To avoid long ‘what’ phrases, we cut VPs
after their first child NP, which long VPs usually contain.
However, we do not cut the ‘what’ candidate if the truncated VP would contain at
most <inline-formula><tex-math>l_{\text{what,min}} = 3</tex-math></inline-formula> tokens, and the right sibling to the VP’s child
NP is a prepositional phrase (PP). This way, we avoid short,
undescriptive ‘what’ phrases. For instance, in the simplified
example: “((NP) The microchip) ((VP) is ((NP) part) ((PP) of a wider
range of the company’s products)).”, the truncated VP “is part”
contains no descriptive information; hence, our presented
rules prevent this truncation.
      </p>
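      <p>The ‘who’ and ‘what’ rules can be illustrated with a short Python sketch. It is a simplified reading of the description above, not the system's code: the parse tree is represented as nested (label, children) tuples rather than CoreNLP output, the function name extract_who_what and the parameter what_min are hypothetical, the token threshold is assumed to apply to the truncated VP, and a sentence whose first subject NP contains a child VP is assumed to yield no candidate.</p>
      <preformat><![CDATA[
```python
from typing import List, Optional, Tuple, Union

# A parse-tree node is either a token (str) or a (label, children) pair.
Node = Union[str, Tuple[str, list]]


def label(node: Node) -> Optional[str]:
    """Return the constituent label, or None for a plain token."""
    return None if isinstance(node, str) else node[0]


def leaves(node: Node) -> List[str]:
    """Collect the tokens below a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [tok for child in node[1] for tok in leaves(child)]


def extract_who_what(sentence: Node, what_min: int = 3) -> Optional[Tuple[str, str]]:
    """Return one ('who', 'what') candidate pair for a sentence, if any."""
    children = sentence[1]
    for np, vp in zip(children, children[1:]):
        if label(np) != "NP" or label(vp) != "VP":
            continue
        # First NP with a VP as right sibling; discard it if it has a child VP.
        if any(label(child) == "VP" for child in np[1]):
            return None
        vp_children = vp[1]
        np_idx = next((i for i, c in enumerate(vp_children) if label(c) == "NP"), None)
        if np_idx is None:
            what = leaves(vp)  # no child NP, so there is nothing to cut
        else:
            truncated = [t for c in vp_children[: np_idx + 1] for t in leaves(c)]
            followed_by_pp = (
                np_idx + 1 < len(vp_children)
                and label(vp_children[np_idx + 1]) == "PP"
            )
            # Keep the full VP if cutting would leave a short phrase before a PP.
            keep_full = len(truncated) <= what_min and followed_by_pp
            what = leaves(vp) if keep_full else truncated
        return " ".join(leaves(np)), " ".join(what)
    return None
```
]]></preformat>
      <p>For the microchip example, the truncated VP “is part” has only two tokens and is followed by a PP, so the sketch keeps the full VP. A longer truncated VP, such as “attacked the German consulate”, would be cut after its first child NP.</p>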
      <p>The environment extractor retrieves phrases describing the
temporal and locality context of the event. To determine ‘when’
candidates, we take TIMEX3 instances from preprocessing.
Similarly, we take the geocodes as ‘where’ candidates.</p>
      <p><sup>1</sup> https://github.com/openstreetmap/Nominatim, v3.0.0</p>
      <p>
        The cause extractor looks for linguistic features indicating a
causal relation within a sentence’s constituents. We look for
three types of cause-effect indicators (cf. [
        <xref ref-type="bibr" rid="ref25 ref26">25, 26</xref>
        ]): causal
conjunctions, causative adverbs, and causative verbs. Causal
conjunctions, e.g., “due to”, “result of”, and “effect of”, connect two
clauses, where the second clause yields the ‘why’ candidate.
For causative adverbs, e.g., “therefore”, “hence”, and “thus”, the
first clause yields the ‘why’ candidate. If one or
more consecutive tokens of a sentence match one of the
indicator phrases adapted from Khoo et al. [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], we take all tokens on the
right (causal conjunction) or left side (causative adverb) as the
‘why’ candidate.
      </p>
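      <p>The handling of causal conjunctions and causative adverbs can be sketched as follows. The sketch assumes pre-tokenized sentences and uses only the example indicators quoted above; the full indicator lists adapted from Khoo et al. are not reproduced here, and the function names are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import List, Optional, Tuple

# Only the examples named in the text; the full lists are not reproduced.
CAUSAL_CONJUNCTIONS = [["due", "to"], ["result", "of"], ["effect", "of"]]
CAUSATIVE_ADVERBS = [["therefore"], ["hence"], ["thus"]]


def find_indicator(tokens: List[str], indicators: List[List[str]]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first indicator match, or None."""
    lowered = [tok.lower() for tok in tokens]
    for start in range(len(lowered)):
        for indicator in indicators:
            end = start + len(indicator)
            if lowered[start:end] == indicator:
                return start, end
    return None


def extract_why(tokens: List[str]) -> Optional[Tuple[str, str]]:
    """Return (indicator type, 'why' candidate) for a sentence, if any."""
    hit = find_indicator(tokens, CAUSAL_CONJUNCTIONS)
    if hit is not None:
        # Causal conjunction: the tokens to the right hold the cause.
        return "conjunction", " ".join(tokens[hit[1]:])
    hit = find_indicator(tokens, CAUSATIVE_ADVERBS)
    if hit is not None:
        # Causative adverb: the tokens to the left hold the cause.
        return "adverb", " ".join(tokens[:hit[0]])
    return None
```
]]></preformat>
      <p>Returning the indicator type together with the phrase is a design choice of this sketch: the type can later feed a scoring factor that rewards less ambiguous indicators.</p>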
      <p>
        Causative verbs, e.g., “activate” and “implicate”, are
contained in the middle VP of the causative NP-VP-NP pattern,
where the last NP yields the ‘why’ candidate [
        <xref ref-type="bibr" rid="ref11 ref26">11, 26</xref>
        ]. For
each NP-VP-NP pattern we find in the parse-tree, we determine
whether the VP is causative. To do this, we extract the VP’s
verb, retrieve the verb’s synonyms from WordNet [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] and
compare the verb and its synonyms with the list of causative
verbs from Girju [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which we also extended by their
synonyms (cf. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]). If there is at least one match, we take the last
NP of the causative pattern as the ‘why’ candidate. To reduce
false positives, we check the NP and VP for the causal
constraints for verbs proposed by Girju [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The method extractor retrieves ‘how’ phrases, i.e., the
method by which an action was performed. The combined
method consists of two subtasks, one analyzing copulative
conjunctions, the other looking for adjectives and adverbs. Often,
sentences with a copulative conjunction contain a method
phrase in the clause that follows the copulative conjunction,
e.g., “after [the train came off the tracks]”. Therefore, we look
for copulative conjunctions compiled from [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]. If a token
matches, we take the right clause as the ‘how’ candidate. To
avoid long phrases, we cut off phrases longer than <inline-formula><tex-math>l_{\text{how,max}} = 10</tex-math></inline-formula>
tokens. The second subtask extracts phrases that consist
purely of adjectives or adverbs (cf. [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]), since these often
represent how an action was performed. We use this extraction
method as a fallback, since we found the copulative
conjunction-based extraction too restrictive in many cases.
      </p>
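      <p>A minimal sketch of the two ‘how’ subtasks is given below. It assumes POS-tagged input as (token, tag) pairs with Penn Treebank tags, uses a placeholder conjunction list containing only the example “after” from the text, truncates (rather than discards) overlong phrases, and returns a single phrase instead of a candidate list. All names are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import List, Optional, Tuple

# Placeholder list: only "after" is taken from the text above.
CONJUNCTIONS = {"after"}


def extract_how(tagged: List[Tuple[str, str]], how_max: int = 10) -> Optional[str]:
    """Return a 'how' phrase for one sentence, or None if nothing matches."""
    # Subtask 1: take the clause to the right of a matching conjunction,
    # cut to at most `how_max` tokens.
    for i, (token, _) in enumerate(tagged):
        if token.lower() in CONJUNCTIONS:
            clause = [tok for tok, _ in tagged[i + 1 : i + 1 + how_max]]
            if clause:
                return " ".join(clause)

    # Subtask 2 (fallback): the longest run of adjectives and adverbs.
    best: List[str] = []
    run: List[str] = []
    for token, pos in tagged:
        if pos.startswith(("JJ", "RB")):
            run.append(token)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []
    return " ".join(best) if best else None
```
]]></preformat>
      <p>For the sentence “Dozens were injured after the train came off the tracks”, the first subtask yields “the train came off the tracks”. For “He drove very quickly”, no conjunction matches and the fallback yields “very quickly”.</p>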
    </sec>
    <sec id="sec-7">
      <label>3.3</label>
      <title>Candidate Scoring</title>
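      <p>The last task is to determine the best candidate for each 5W1H question.
The scoring consists of two sub-tasks. First, we score
candidates independently for each of the 5W1H questions. Second,
we perform a combined scoring where we adjust the scores of
candidates of one question depending on properties, e.g., position,
of candidates of other questions. For each question <inline-formula><tex-math>q</tex-math></inline-formula>, we use a
scoring function that is composed as a weighted sum of <inline-formula><tex-math>n</tex-math></inline-formula>
scoring factors: <inline-formula><tex-math>s_q = \sum_{i=0}^{n-1} w_{q,i}\, s_{q,i}</tex-math></inline-formula>, where <inline-formula><tex-math>w_{q,i}</tex-math></inline-formula> is the weight of the
scoring factor <inline-formula><tex-math>s_{q,i}</tex-math></inline-formula>.</p>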
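      <p>The weighted sum can be written down directly. In the sketch below, the function names, the example weights, and the factor values are purely illustrative and are not the values learned for Giveme5W1H.</p>
      <preformat><![CDATA[
```python
from typing import Dict, Sequence


def weighted_score(weights: Sequence[float], factors: Sequence[float]) -> float:
    """Score of one candidate: the sum of weight * factor over all factors."""
    if len(weights) != len(factors):
        raise ValueError("need exactly one weight per scoring factor")
    return sum(w * s for w, s in zip(weights, factors))


def best_candidate(candidates: Dict[str, Sequence[float]], weights: Sequence[float]) -> str:
    """Return the candidate phrase with the highest weighted score."""
    return max(candidates, key=lambda c: weighted_score(weights, candidates[c]))


if __name__ == "__main__":
    # Hypothetical weights for three factors (position, frequency, NE).
    weights = (0.5, 0.3, 0.2)
    candidates = {
        "The Taliban": (1.0, 0.8, 1.0),
        "The suicide attacker": (0.6, 0.4, 0.0),
    }
    print(best_candidate(candidates, weights))  # prints "The Taliban"
```
]]></preformat>
      <p>Keeping the weights separate from the factor functions makes it straightforward to re-learn the weights on an annotated dataset without touching the factor definitions.</p>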
      <p>
        To score ‘who’ candidates, we define three scoring factors:
the candidate shall occur in the article (1) early and (2) often,
and (3) contain a named entity. The first scoring factor targets
the concept of the inverted pyramid [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]: news articles mention the most
important information, i.e., the main event, early in the article,
e.g., in the headline and lead paragraph, while later paragraphs
contain details. However, journalists often use so-called hooks
to get the reader’s attention without revealing all content of the
article [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. Hence, for each candidate, we also consider the
frequency of similar phrases in the article, since the primary actor
involved in the main event is likely to be mentioned frequently
in the article. Furthermore, if a candidate contains a NE, we will
score it higher, since in news, the actors involved in events are
often NEs, e.g., politicians. Table 1 shows the weights and
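scoring factors.
<inline-formula><tex-math>n_{\text{pos}}(c)</tex-math></inline-formula> is the position, measured in sentences, of candidate <inline-formula><tex-math>c</tex-math></inline-formula>
within the document, <inline-formula><tex-math>n_{\text{f}}(c)</tex-math></inline-formula> the frequency of phrases similar to
<inline-formula><tex-math>c</tex-math></inline-formula> in the document, and <inline-formula><tex-math>\mathrm{NE}(c) = 1</tex-math></inline-formula> if <inline-formula><tex-math>c</tex-math></inline-formula> contains an NE, else 0 (cf.
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]). To measure <inline-formula><tex-math>n_{\text{f}}(c)</tex-math></inline-formula> of the actor in candidate <inline-formula><tex-math>c</tex-math></inline-formula>, we use the
number of the actor’s coreferences, which we extracted during
coreference resolution (see Section 3.1). This allows
Giveme5W1H to recognize and count name variations, as well
as pronouns. Due to the strong relation between agent and
action, we rank VPs according to their NPs’ scores. Hence, the
most likely VP is the sibling in the parse tree of the most likely
NP: <inline-formula><tex-math>s_{\text{what}} = s_{\text{who}}</tex-math></inline-formula>.
      </p>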
      <p>
        We score temporal candidates according to four scoring
factors: the candidate shall occur in the article (1) early and (2)
often. It should also be (3) close to the publishing date of the
article, and (4) of a relatively short duration. The first two scoring
factors have the same motivation as in the scoring of ‘who’
candidates. The idea for the third scoring factor, close to the
publishing date, is that events reported on by news articles often
occurred on the same day or on the day before the article was
published. For example, if a candidate represents a date one or
more years in the past before the publishing date of the article,
the candidate will achieve the lowest possible score in the third
scoring factor. The fourth scoring factor prefers temporal
candidates that have a short duration, since events according to
our definition happen during a specific point in time with a
short duration. We logarithmically normalize the duration
factor between one minute and one month (cf. [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ]). The resulting
scoring formula for a temporal candidate  is the sum of the
weighted scoring factors shown in Table 2.
      </p>
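      <p>[…] date of the news article <inline-formula><tex-math>d_{\text{pub}}</tex-math></inline-formula>, <inline-formula><tex-math>s(c)</tex-math></inline-formula> the duration in seconds of <inline-formula><tex-math>c</tex-math></inline-formula>,
and the normalization constants <inline-formula><tex-math>e_{\max} \approx 2.5\,\text{Ms}</tex-math></inline-formula> (one month in
seconds), <inline-formula><tex-math>s_{\min} = 60\,\text{s}</tex-math></inline-formula>, and <inline-formula><tex-math>s_{\max} \approx 31\,\text{Ms}</tex-math></inline-formula> (one year).</p>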
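      <p>The third and fourth temporal scoring factors can be approximated as shown below. The exact formulas are given in Table 2 and are not reproduced here, so both functions are assumptions that merely follow the prose: closeness to the publishing date decreases linearly up to the one-month constant, and duration is normalized logarithmically between the one-minute and one-year constants. The constant and function names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math

E_MAX = 2.5e6  # approx. one month in seconds (constant from the text)
S_MIN = 60.0   # one minute in seconds
S_MAX = 31e6   # approx. one year in seconds


def closeness_factor(delta_seconds: float, e_max: float = E_MAX) -> float:
    """1.0 if the candidate coincides with the publishing date, falling
    linearly to 0.0 at a distance of `e_max` seconds or more (assumed form)."""
    return 1.0 - min(1.0, abs(delta_seconds) / e_max)


def duration_factor(duration_seconds: float,
                    s_min: float = S_MIN, s_max: float = S_MAX) -> float:
    """1.0 for durations of at most `s_min`, 0.0 for `s_max` or longer,
    interpolated on a logarithmic scale in between (assumed form)."""
    clamped = min(max(duration_seconds, s_min), s_max)
    span = math.log(s_max) - math.log(s_min)
    return 1.0 - (math.log(clamped) - math.log(s_min)) / span
```
]]></preformat>
      <p>On the logarithmic scale, the step from one minute to one hour lowers the duration factor as much as the step from one hour to sixty hours, which reflects the preference for short, specific time expressions over whole days or months.</p>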
      <p>
        The scoring of location candidates follows four scoring
factors: the candidate shall occur (1) early and (2) often in the
article. It should also be (3) often geographically contained in
other location candidates and be (4) specific. The first two
scoring factors have the same motivation as in the scoring of ‘who’
and ‘when’ candidates. The second and third scoring factors
aim to find locations that occur often, either by being similar
to other candidates or by being contained in other location
candidates. The fourth scoring factor favors specific locations, e.g.,
Berlin, over broader mentions of location, e.g., Germany or
Europe. We logarithmically normalize the location specificity
between <italic>a</italic><sub>min</sub> = 225 m<sup>2</sup> (a small property’s size) and <italic>a</italic><sub>max</sub> =
530,000 km<sup>2</sup> (approx. the mean area of all countries [
        <xref ref-type="bibr" rid="ref43">43</xref>
        ]). We
discuss other scoring options in Section 5. The used weights
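and scoring factors are shown in Table 3. We measure <italic>n</italic><sub>f</sub>(<italic>c</italic>), the
number of similar mentions of candidate <italic>c</italic>, by counting how
many other candidates have the same Nominatim place ID. We
measure <italic>n</italic><sub>e</sub>(<italic>c</italic>) by counting how many other candidates are
geographically contained within the bounding box of <italic>c</italic>, where
<italic>a</italic>(<italic>c</italic>) is the area of the bounding box of <italic>c</italic> in square meters.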
We score causal candidates using two scoring factors: the candidates shall (1) occur early in the document, and (2) their causal type shall be
reliable [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. The second scoring factor rewards causal types with
low ambiguity (cf. [
        <xref ref-type="bibr" rid="ref11 ref3">3, 11</xref>
        ]), e.g., “because” has a very high
likelihood that the subsequent phrase contains a cause [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The
weighted scoring factors are shown in Table 4. The causal type
TC(<italic>c</italic>) = 1 if <italic>c</italic> is extracted due to a causal conjunction, 0.62 if it
starts with a causative RB, and 0.06 if it contains a causative VB
(cf. [
        <xref ref-type="bibr" rid="ref25 ref26">25, 26</xref>
        ]).
The scoring of method candidates uses three simple scoring
factors: the candidate shall occur (1) early and (2) often in the
news article, and (3) its method type shall be reliable. The
weighted scoring factors for method candidates are shown in
Table 5. We measure the number of mentions of a method phrase, <italic>n</italic><sub>f</sub>(<italic>c</italic>), by the term frequency
(including inflected forms) of its most frequent token (cf. [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ]).
      </p>
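      <p>The two counts used for location candidates, i.e., the number of other candidates with the same place ID and the number of other candidates that lie within a candidate's bounding box, can be sketched as follows. The <monospace>Location</monospace> class and all function names are hypothetical and are not taken from the Giveme5W1H code base; the containment test is a simplification that ignores bounding boxes crossing the antimeridian.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical container for a geocoded location candidate."""
    place_id: int
    south: float
    west: float
    north: float
    east: float


def count_similar(candidate, others):
    """Number of other candidates that share the candidate's place ID."""
    return sum(1 for o in others if o.place_id == candidate.place_id)


def contains(outer, inner):
    """True if inner's bounding box lies completely within outer's."""
    return (inner.south >= outer.south and outer.north >= inner.north
            and inner.west >= outer.west and outer.east >= inner.east)


def count_contained(candidate, others):
    """Number of other candidates located inside the candidate's bounding box."""
    return sum(1 for o in others if contains(candidate, o))
```
      </preformat>
      <p>Both functions expect the caller to pass only the other candidates, so a candidate is never counted against itself.</p>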
      <p>The final sub-task in candidate scoring is combined scoring,
which adjusts scores of candidates of a single 5W1H question
depending on the candidates of other questions. To improve
the scoring of method candidates, we devise a combined
sentence-distance scorer. The assumption is that the method of
performing an action should be close to the mention of the
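action. The resulting equation for a method candidate <italic>c</italic> given an
action candidate <italic>a</italic> is:
<disp-formula><tex-math><![CDATA[s_{\mathrm{how,new}}(c, a) = s_{\mathrm{how}}(c) - w_0 \, \lvert n_{\mathrm{pos}}(c) - n_{\mathrm{pos}}(a) \rvert]]></tex-math></disp-formula>
where <italic>w</italic><sub>0</sub> = 1. Section 5 describes additional scoring
approaches.</p>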
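      <p>The combined sentence-distance scorer can be sketched in a few lines. The sketch subtracts a weighted distance between the method candidate and the action candidate from the method score, as described above. It assumes that positions are given as relative positions in the article (0 = start, 1 = end); this unit, the function names, and the tuple layout are assumptions for illustration and do not reflect the actual implementation.</p>
      <preformat>
```python
def combined_how_score(how_score, how_position, action_position, w0=1.0):
    """Penalize a method candidate by its distance to the action candidate.

    how_score       -- score of the method candidate before combination
    how_position    -- position of the method candidate in the article
    action_position -- position of the action candidate in the article
    w0              -- weight of the distance penalty (1 in the text)
    """
    return how_score - w0 * abs(how_position - action_position)


def rerank(method_candidates, action_position, w0=1.0):
    """Sort (phrase, score, position) tuples by combined score, best first."""
    rescored = [
        (phrase, combined_how_score(score, pos, action_position, w0))
        for phrase, score, pos in method_candidates
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```
      </preformat>
      <p>With this penalty, a method candidate that scores slightly lower on its own can still be ranked first if it is mentioned next to the action.</p>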
    </sec>
    <sec id="sec-8">
      <title>Output</title>
      <p>The highlighted phrases in Figure 1 are candidates extracted by
Giveme5W1H for each of the 5W1H event properties of the
shown article. Giveme5W1H enriches the returned phrases
with additional information that the system extracted for its
own analysis or during custom enrichment, with which users
can integrate their own preprocessing. The additional
information for each token is its POS-tag, parse-tree context, and NE
type if applicable. Enriching the tokens with this information


increases the efficiency of the overall analysis workflow in
which Giveme5W1H may be embedded, since later analysis
tasks can reuse the information.</p>
      <p>
        For the temporal phrases and locality phrases,
Giveme5W1H also provides their canonical forms, i.e., TIMEX3
instances and geocodes. For the news article shown in Figure 1,
the canonical form of the ‘when’ phrase represents the entire
day of November 10, 2016. The canonical geocode for the
‘where’ phrase comprises the coordinates of the center of the
city Mazar-i-Sharif (36°42'30.8"N 67°07'09.7"E), a
bounding box that represents the area of the city, and further
information from OSM, such as a canonical name and a place ID, which
uniquely identifies the place. Lastly, Giveme5W1H provides
linked YAGO concepts [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] for other NEs.
      </p>
    </sec>
    <sec id="sec-9">
      <title>Parameter Learning</title>
      <p>
        Determining the best values for the parameters introduced in
Section 3, e.g., weights of scoring factors, is a supervised ML
problem [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Since there is no gold standard for journalistic
5W1H extraction on news (see Section 2), we created an
annotated dataset.
      </p>
      <p>
        The dataset is available in the open-source repository (see
Section 6). To facilitate diversity in both content and writing
style, we selected 13 major news outlets from the U.S. and the
UK. We sampled 100 articles from the news categories politics,
disaster, entertainment, business and sports for November 6th
– 14th, 2016. We crawled the articles [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and manually revised
the extracted information to ensure that it was free of
extraction errors.
      </p>
      <p>We asked three annotators (graduate IT students, aged
between 22 and 26) to read each of the 100 news articles and to
annotate the single most suitable phrase for each 5W1H
question. Finally, for each article and question, we combined the
annotations using a set of combination rules, e.g., if all phrases
were semantically equal, we selected the most concise phrase,
or if there was no agreement between the annotators, we
selected each annotator’s first phrase, resulting in three
semantically diverging but valid phrases. We also manually added a
TIMEX3 instance to each ‘when’ annotation, which was used by
the error function for ‘when’. The intercoder reliability was
ICR<sub>ann</sub> = 0.81, measured using average pairwise percentage
agreement.</p>
      <p>
        We divided the dataset into two subsets for training (80%
randomly sampled articles) and testing (20%). To find the
optimal parameter values for our extraction method, we first
computed for each parameter configuration the mean error
(ME) on the training set. To measure the ME of a configuration,
we devised three error functions to measure the semantic
distance between candidate phrases and annotated phrases. For
the textual candidates, i.e., who, what, why, and how, we used
the Word Mover’s Distance (WMD) [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. WMD is a
state-of-the-art generic measure for the semantic similarity of two phrases. For
‘when’ candidates, we computed the difference in seconds
between candidate and annotation. For ‘where’ candidates, we
computed the distance in meters between both coordinates.
We linearly normalized all measures.
      </p>
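      <p>The error functions for ‘when’ and ‘where’ candidates can be sketched as follows. The text only states that the differences are measured in seconds and in meters and that all measures are linearly normalized; the use of the haversine great-circle distance and the normalization by the largest observed error are assumptions made for this sketch, as are all function names.</p>
      <preformat>
```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def when_error_seconds(candidate, annotation):
    """Absolute difference in seconds between two datetimes."""
    return abs((candidate - annotation).total_seconds())


def where_error_meters(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters between two coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def normalize_linear(errors):
    """Scale errors linearly to [0, 1] by dividing by the largest error."""
    largest = max(errors)
    if largest == 0:
        return [0.0 for _ in errors]
    return [e / largest for e in errors]


# Example: candidate and annotation are exactly one day apart.
example = when_error_seconds(datetime(2016, 11, 10), datetime(2016, 11, 11))
```
      </preformat>
      <p>Normalizing all three error types to the same range makes the mean error comparable across questions, regardless of whether the raw error is a semantic distance, a number of seconds, or a number of meters.</p>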
      <p>We then validated the 5% best performing configurations
on the test set and discarded all configurations that yielded a
significantly different ME. Finally, we selected the best
performing parameter configuration for each question.
</p>
    </sec>
    <sec id="sec-10">
      <title>EVALUATION</title>
      <p>
        We used the same evaluation rules and procedure as described
by Hamborg et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] but employed a larger dataset of 120
news articles, which we sampled from the BBC dataset [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] in
order to conduct a survey with three assessors. The dataset
contains 24 news articles in each of the following categories:
business (Bus), entertainment (Ent), politics (Pol), sport (Spo),
and tech (Tec). We asked the assessors to read one article at a
time. After the assessors had read an article, we showed them the
5W1H phrases that had been extracted by the system and
asked them to judge the relevance of each answer on a 3-point
scale: non-relevant (if an answer contained no relevant
information, score <italic>s</italic> = 0), partially relevant (if only part of the
answer was relevant or if information was missing, <italic>s</italic> = 0.5), and
relevant (if the answer was completely relevant without
missing information, <italic>s</italic> = 1).
      </p>
      <p>
        Table 6 shows the mean average generalized precision
(MAgP), a score suitable for multi-graded relevance
assessments [
        <xref ref-type="bibr" rid="ref17 ref23">17, 23</xref>
        ]. MAgP was 0.73 over all categories and
questions. If only considering the first 4Ws, which the literature
considers as sufficient to represent an event (cf. [
        <xref ref-type="bibr" rid="ref21 ref40 ref49">21, 40, 49</xref>
        ]),
overall MAgP was 0.82.
Of the few existing approaches capable of extracting phrases
that answer all six 5W1H questions (see Section 2), only one
publication reported the results of an evaluation: the approach
developed by Khodra achieved a precision of 0.74 on
Indonesian articles [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Others did not conduct any evaluation [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] or
only evaluated the extracted ‘who’ and ‘what’ phrases of
Japanese news articles [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <p>
        We also investigated the performance of systems that are
only capable of extracting 5W phrases. Our system achieves an
MAgP<sub>5W</sub> of 0.75, which is 0.05 higher than the MAgP of
Giveme5W [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. We also compared the performance with other
systems, despite the difficulties mentioned by Hamborg et al.
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]: other systems were tested on non-disclosed datasets [
        <xref ref-type="bibr" rid="ref34 ref47 ref48">34, 47, 48</xref>
        ] or on datasets translated from other languages [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ], they
were devised for different languages [
        <xref ref-type="bibr" rid="ref22 ref24 ref45">22, 24, 45</xref>
        ], or they used
different evaluation measures, such as error rates [
        <xref ref-type="bibr" rid="ref47">47</xref>
        ] or
binary relevance assessments [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ], neither of which is optimal
because of the non-binary relevance of 5W1H answers (cf.
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]). Finally, none of the related systems have been made
publicly available or have been described in sufficient detail to
enable a re-implementation, which was the primary motivation
for our research (see Section 1).
      </p>
      <p>
        Therefore, a direct comparison of the results and related
work was not possible. Compared to the fraction of correct 5W
answers by the best system by Parton et al. [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ], Giveme5W1H
achieves a 0.12 higher MAgP<sub>5W</sub>. The best system by Yaman et
al. achieved a precision of <italic>P</italic><sub>5W</sub> = 0.89 [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ], which is 0.14 higher
than our MAgP<sub>5W</sub> and – as a rough approximation of the best
achievable precision [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] – surprisingly almost identical to the
ICR of our assessors.
      </p>
      <p>We found that different forms of journalistic presentation in
the five news categories of the dataset led to different
extraction performance. Politics articles, which yielded the best
performance, mostly reported on single events. The performance
on sports articles was unexpectedly high, even though sports articles
not only report on single events but also include background reports and
announcements, for which event detection is more difficult.
Determining the ‘how’ in sports articles was difficult (MAgP<sub>how</sub> =
0.51), since articles often described the method of an
event only implicitly, e.g., how one team won a match, by reporting on multiple
key events during the match. Some categories, such as
entertainment and tech, achieved lower extraction performance,
mainly because their articles often contained much background
information on earlier events and the actors involved.</p>
    </sec>
    <sec id="sec-11">
      <title>DISCUSSION AND FUTURE WORK</title>
      <p>Most importantly, we plan to improve the extraction quality of
the ‘what’ question, being one of the important 4W questions.
We aim to achieve an extraction performance similar to the
performance of the ‘who’ extraction (MAgP<sub>who</sub> = 0.91), since
both are strongly related. In our evaluation, we identified two
main issues: (1) joint extraction of optimal ‘who’ candidates
with non-optimal ‘what’ candidates and (2) cut-off ‘what’
candidates. In some cases (1), the headline contained a concise
‘who’ phrase but the ‘what’ phrase did not contain all
information, e.g., because it only aimed to catch the reader’s interest,
a journalistic hook (Section 2). We plan to devise separate
extraction methods for both questions. In doing so, we need to
ensure that the top candidates of both questions fit each other,
e.g., by verifying that the semantic concept of the answer of
each question, e.g., represented by the nouns in the ‘who’
phrase, or verbs in the ‘what’ phrase, co-occur in at least one
sentence of the article. In other cases (2), our strategy to avoid
overly detailed ‘what’ candidates (Section 3.2) cut off the relevant
information, e.g., “widespread corruption in the finance
ministry has cost it $2m”, in which the underlined text was cut off.
We will investigate dependency parsing and further syntax
rules, e.g., to always include the direct object of a transitive
verb.</p>
      <p>For ‘when’ and ‘where’ questions, we found that in some
cases an article does not explicitly mention the main event’s
date or location. The date of an event may be implicitly defined
by the reported event, e.g., “in the final of the Canberra Classic”.
The location may be implicitly defined by the main actor, e.g.,
“Apple Postpones Release of […]”, which likely happened at the
Apple headquarters in Cupertino. Similarly, the proper noun
“Stanford University” also defines a location. We plan to
investigate how we can use the YAGO concepts, which are linked to
NEs, to gather further information regarding the date and
location of the main event. If no date can be identified, the
publishing date of the article or the day before it might sometimes be a
suitable fallback date.</p>
      <p>Using the standardized TIMEX3 instances from SUTime is an
improvement (MAgP<sub>when</sub> = 0.78) over a first version, in which
we used dates without a duration (MAgP<sub>when</sub> = 0.72).</p>
      <p>The extraction of ‘why’ and ‘how’ phrases was most
challenging, which manifests in lower extraction performances
compared to the other questions. One reason is that articles
often do not explicitly state a single cause or method of an event,
but implicitly describe this throughout the article, particularly
in sports articles (see Section 4). In such cases, NLP methods
are currently not advanced enough to find and abstract or
summarize the cause or method (see Section 3.3). However, we
plan to improve the extraction accuracy by preventing the
system from returning false positives. For instance, in cases where
no cause or method could be determined, we plan to introduce
a score threshold to prevent the system from outputting
candidates with a low score, which are presumably wrong. Currently,
the system always outputs a candidate if at least one cause or
method was found.</p>
      <p>To improve the performance of all textual questions, i.e.,
who, what, why, and how, we will investigate two approaches.
First, we want to improve measuring a candidate’s frequency,
an important scoring factor in multiple questions (Section 3.3).
We currently use the number of coreferences, which does not
include synonymous mentions. We plan to count the number of
YAGO concepts that are semantically related to the current
candidate. Second, we found that a few top candidates of the four
textual questions were semantically correct but only contained
a pronoun referring to the more meaningful noun. We plan to
add the coreference’s original mention to extracted answers.
</p>
    </sec>
    <sec id="sec-12">
      <title>CONCLUSION</title>
      <p>In this paper, we proposed Giveme5W1H, the first open-source
system that extracts answers to the journalistic 5W1H
questions, i.e., who did what, when, where, why, and how, to describe
a news article’s main event. The system canonicalizes temporal
mentions in the text to standardized TIMEX3 instances,
locations to geocoordinates, and other NEs, e.g., persons and
organizations, to unique concepts in a knowledge graph. The system
uses syntactic and domain-specific rules to extract and score
phrases for each 5W1H question. Giveme5W1H achieved a
mean average generalized precision (MAgP) of 0.73 on all
questions, and an MAgP of 0.82 on the first four W questions (who,
what, when, and where), which alone can represent an event.
Extracting the answers to ‘why’ and ‘how’ performed more
poorly, since articles often only imply causes and methods.
Answering the 5W1H questions is at the core of understanding
any article, and thus an essential task in many research efforts
that analyze articles. We hope that redundant implementations
and non-reproducible evaluations can be avoided with
Giveme5W1H as the first universally applicable, modular, and
open-source 5W1H extraction system. In addition to benefiting
developers and computer scientists, our system especially
benefits researchers from the social sciences, for whom automated
5W1H extraction was not previously accessible.</p>
      <p>Giveme5W1H and the datasets for training and evaluation
are available at:
https://github.com/fhamborg/Giveme5W1H</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>Agence France-Presse</collab>
          <year>2016</year>
          .
          <article-title>Taliban attacks German consulate in northern Afghan city of Mazar-i-Sharif with truck bomb</article-title>
          .
          <source>The Telegraph.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Allan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          et al.
          <year>1998</year>
          .
          <article-title>Topic detection and tracking pilot study: Final report</article-title>
          .
          <source>Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop</source>
          (
          <year>1998</year>
          ),
          <fpage>194</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Asghar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey</article-title>
          .
          <source>arXiv preprint arXiv:1605.07895</source>
          . (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Best</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          et al.
          <year>2005</year>
          .
          <article-title>Europe media monitor</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al.
          <year>2009</year>
          .
          <article-title>Natural language processing with Python: analyzing text with the natural language toolkit.</article-title>
          <publisher-name>O'Reilly Media, Inc.</publisher-name>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Burnham</surname>
            ,
            <given-names>K.P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Model selection and multimodel inference: a practical information-theoretic approach</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>A.X.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>SUTime: A library for recognizing and normalizing time expressions</article-title>
          .
          <source>LREC</source>
          . iii (
          <year>2012</year>
          ),
          <fpage>3735</fpage>
          -
          <lpage>3740</lpage>
          . DOI:https://doi.org/10.1017/CBO9781107415324.004.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Christian</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          et al.
          <year>2014</year>
          .
          <article-title>The Associated Press stylebook and briefing on media law</article-title>
          . The Associated Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          et al.
          <year>2012</year>
          .
          <article-title>The 5W structure for sentiment summarization-visualization-tracking</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          (
          <year>2012</year>
          ),
          <fpage>540</fpage>
          -
          <lpage>555</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Finkel</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          et al.
          <year>2005</year>
          .
          <article-title>Incorporating non-local information into information extraction systems by gibbs sampling</article-title>
          .
          <source>Proceedings of the 43rd annual meeting on association for computational linguistics</source>
          (
          <year>2005</year>
          ),
          <fpage>363</fpage>
          -
          <lpage>370</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Girju</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2003</year>
          .
          <article-title>Automatic detection of causal relations for question answering</article-title>
          .
          <source>Proceedings of the ACL 2003 workshop on Multilingual summarization and question answering-Volume</source>
          <volume>12</volume>
          (
          <year>2003</year>
          ),
          <fpage>76</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Greene</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Practical solutions to the problem of diagonal dominance in kernel document clustering</article-title>
          .
          <source>Proceedings of the 23rd international conference on Machine learning</source>
          (
          <year>2006</year>
          ),
          <fpage>377</fpage>
          -
          <lpage>384</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Hamborg</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2019</year>
          .
          <article-title>Automated Identification of Media Bias by Word Choice and Labeling in News Articles</article-title>
          .
          <source>Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL)</source>
          (
          <conf-loc>Urbana-Champaign, IL</conf-loc>
          , USA,
          <year>2019</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Hamborg</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2018</year>
          .
          <article-title>Automated identification of media bias in news articles: an interdisciplinary literature review</article-title>
          .
          <source>International Journal on Digital Libraries</source>
          . (
          <year>2018</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          . DOI:https://doi.org/10.1007/s00799-018-0261-y.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Hamborg</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2018</year>
          .
          <article-title>Bias-aware news analysis using matrix-based news aggregation</article-title>
          .
          <source>International Journal on Digital Libraries</source>
          . (
          <year>2018</year>
          ). DOI:https://doi.org/10.1007/s00799-018-0239-9.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Hamborg</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2018</year>
          .
          <article-title>Extraction of Main Event Descriptors from News Articles by Answering the Journalistic Five W and One H Questions</article-title>
          .
          <source>Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL)</source>
          (<conf-loc>Fort Worth, Texas</conf-loc>
          , USA,
          <year>2018</year>
          ),
          <fpage>339</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Hamborg</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2018</year>
          .
          <article-title>Giveme5W: Main Event Retrieval from News Articles by Extraction of the Five Journalistic W Questions</article-title>
          .
          <source>Proceedings of the iConference 2018</source>
          (
          <conf-loc>Sheffield</conf-loc>
          , UK,
          <year>2018</year>
          ),
          <fpage>355</fpage>
          -
          <lpage>356</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Hamborg</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2017</year>
          .
          <article-title>news-please: A Generic News Crawler and Extractor</article-title>
          .
          <source>Proceedings of the 15th International Symposium of Information Science</source>
          (
          <year>2017</year>
          ),
          <fpage>218</fpage>
          -
          <lpage>223</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Hoffart</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          et al.
          <year>2011</year>
          .
          <article-title>Robust Disambiguation of Named Entities in Text</article-title>
          .
          <source>Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing</source>
          (
          <year>2011</year>
          ),
          <fpage>782</fpage>
          -
          <lpage>792</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Hripcsak</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rothschild</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>Agreement, the F-measure, and reliability in information retrieval</article-title>
          .
          <source>Journal of the American Medical Informatics Association</source>
          .
          <volume>12</volume>
          ,
          <issue>3</issue>
          (
          <year>2005</year>
          ),
          <fpage>296</fpage>
          -
          <lpage>298</lpage>
          . DOI:https://doi.org/10.1197/jamia.M1733.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Ide</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          et al.
          <year>2005</year>
          .
          <article-title>TrackThem: Exploring a large-scale news video archive by tracking human relations</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          (
          <year>2005</year>
          ),
          <fpage>510</fpage>
          -
          <lpage>515</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Ikeda</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          et al.
          <year>1998</year>
          .
          <article-title>Information Classification and Navigation Based on 5W1H of the Target Information</article-title>
          .
          <source>Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics-Volume</source>
          <volume>1</volume>
          (
          <year>1998</year>
          ),
          <fpage>571</fpage>
          -
          <lpage>577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Kekäläinen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Järvelin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Using graded relevance assessments in IR evaluation</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology</source>
          .
          <volume>53</volume>
          ,
          <issue>13</issue>
          (
          <year>2002</year>
          ),
          <fpage>1120</fpage>
          -
          <lpage>1129</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Khodra</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Event extraction on Indonesian news article using multiclass categorization</article-title>
          .
          <source>ICAICTA 2015 - 2015 International Conference on Advanced Informatics: Concepts, Theory and Applications</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Khoo</surname>
            ,
            <given-names>C.S.G.</given-names>
          </string-name>
          et al.
          <year>1998</year>
          .
          <article-title>Automatic extraction of cause-effect information from newspaper text without knowledge-based inferencing</article-title>
          .
          <source>Literary and Linguistic Computing</source>
          .
          <volume>13</volume>
          ,
          <issue>4</issue>
          (
          <year>1998</year>
          ),
          <fpage>177</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Khoo</surname>
            ,
            <given-names>C.S.G.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>Automatic identification of causal relations in text and their use for improving precision in information retrieval</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Kusner</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          et al.
          <year>2015</year>
          .
          <article-title>From Word Embeddings To Document Distances</article-title>
          .
          <source>Proceedings of The 32nd International Conference on Machine Learning</source>
          .
          <volume>37</volume>
          , (
          <year>2015</year>
          ),
          <fpage>957</fpage>
          -
          <lpage>966</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Lejeune</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          et al.
          <year>2015</year>
          .
          <article-title>Multilingual event extraction for epidemic detection</article-title>
          .
          <source>Artificial Intelligence in Medicine</source>
          . (
          <year>2015</year>
          ). DOI:https://doi.org/10.1016/j.artmed.2015.06.005.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          et al.
          <year>2003</year>
          .
          <article-title>InfoXtract location normalization: a hybrid approach to geographic references in information extraction</article-title>
          .
          <source>Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References</source>
          .
          <volume>1</volume>
          , (
          <year>2003</year>
          ),
          <fpage>39</fpage>
          -
          <lpage>44</lpage>
          . DOI:https://doi.org/10.3115/1119394.1119400.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Mahdisoltani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          et al.
          <year>2015</year>
          .
          <article-title>YAGO3: A Knowledge Base from Multilingual Wikipedias</article-title>
          .
          <source>Proceedings of CIDR</source>
          . (
          <year>2015</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . DOI:https://doi.org/10.1016/j.jbi.2013.09.007.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>WordNet: a lexical database for English</article-title>
          .
          <source>Communications of the ACM</source>
          .
          <volume>38</volume>
          ,
          <issue>11</issue>
          (
          <year>1995</year>
          ),
          <fpage>39</fpage>
          -
          <lpage>41</lpage>
          . DOI:https://doi.org/10.1145/219717.219748.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Oliver</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Maney</surname>
            ,
            <given-names>G.M.</given-names>
          </string-name>
          <year>2000</year>
          .
          <article-title>Political Processes and Local Newspaper Coverage of Protest Events: From Selection Bias to Triadic Interactions</article-title>
          .
          <source>American Journal of Sociology</source>
          .
          <volume>106</volume>
          ,
          <issue>2</issue>
          (
          <year>2000</year>
          ),
          <fpage>463</fpage>
          -
          <lpage>505</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33] Oxford English 2009.
          <source>Oxford English Dictionary</source>
          . Oxford University Press.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>Parton</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          et al.
          <year>2009</year>
          .
          <article-title>Who, what, when, where, why?: comparing multiple approaches to the cross-lingual 5W task</article-title>
          .
          <source>Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1-Volume</source>
          <volume>1</volume>
          (
          <year>2009</year>
          ),
          <fpage>423</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          et al.
          <year>2012</year>
          .
          <article-title>Improving the Hook in Case Writing</article-title>
          .
          <source>Journal of Case Studies</source>
          .
          <volume>30</volume>
          , (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al.
          <year>2013</year>
          .
          <article-title>News Event Extraction Using 5W1H Approach &amp; Its Analysis</article-title>
          .
          <source>International Journal of Scientific &amp; Engineering Research - IJSER</source>
          .
          <volume>4</volume>
          ,
          <issue>5</issue>
          (
          <year>2013</year>
          ),
          <fpage>2064</fpage>
          -
          <lpage>2067</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <surname>Sharp</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Kipling's guide to writing a scientific paper</article-title>
          .
          <source>Croatian Medical Journal</source>
          .
          <volume>43</volume>
          ,
          <issue>3</issue>
          (
          <year>2002</year>
          ),
          <fpage>262</fpage>
          -
          <lpage>267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <surname>Strötgen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gertz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Multilingual and cross-domain temporal tagging</article-title>
          .
          <source>Language Resources and Evaluation</source>
          .
          <volume>47</volume>
          ,
          <issue>2</issue>
          (
          <year>2013</year>
          ),
          <fpage>269</fpage>
          -
          <lpage>298</lpage>
          . DOI:https://doi.org/10.1007/s10579-012-9179-y.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <surname>Suchanek</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          et al.
          <year>2007</year>
          .
          <article-title>YAGO: a core of semantic knowledge</article-title>
          .
          <source>Proceedings of the 16th international conference on World Wide Web</source>
          . (
          <year>2007</year>
          ),
          <fpage>697</fpage>
          -
          <lpage>706</lpage>
          . DOI:https://doi.org/10.1145/1242572.1242667.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Sundberg</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Melander</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Introducing the UCDP Georeferenced Event Dataset</article-title>
          .
          <source>Journal of Peace Research</source>
          .
          <volume>50</volume>
          ,
          <issue>4</issue>
          (
          <year>2013</year>
          ),
          <fpage>523</fpage>
          -
          <lpage>532</lpage>
          . DOI:https://doi.org/10.1177/0022343313484347.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Sundheim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>1992</year>
          .
          <article-title>Overview of the fourth message understanding evaluation and conference</article-title>
          .
          <source>Proceedings of the 4th conference on Message understanding</source>
          (
          <year>1992</year>
          ),
          <fpage>3</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <surname>Tanev</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          et al.
          <year>2008</year>
          .
          <article-title>Real-time news event extraction for global crisis monitoring</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          (
          <year>2008</year>
          ),
          <fpage>207</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <source>The CIA World Factbook</source>
          :
          <year>2016</year>
          . https://www.cia.gov/library/publications/the-world-factbook/geos/.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44] TimeML Working Group 2009.
          <article-title>Guidelines for Temporal Expression Annotation for English for TempEval 2010</article-title>
          . English. (
          <year>2009</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          et al.
          <year>2010</year>
          .
          <article-title>Chinese news event 5w1h elements extraction using semantic role labeling</article-title>
          .
          <source>Information Processing (ISIP), 2010 Third International Symposium on</source>
          (
          <year>2010</year>
          ),
          <fpage>484</fpage>
          -
          <lpage>489</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <surname>Wick</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          et al.
          <year>2008</year>
          .
          <article-title>A unified approach for schema matching, coreference and canonicalization</article-title>
          .
          <source>Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '08</source>
          (
          <year>2008</year>
          ),
          <fpage>722</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <surname>Yaman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al.
          <year>2009</year>
          .
          <article-title>Classification-based strategies for combining multiple 5-w question answering systems</article-title>
          .
          <source>INTERSPEECH</source>
          (
          <year>2009</year>
          ),
          <fpage>2703</fpage>
          -
          <lpage>2706</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <surname>Yaman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al.
          <year>2009</year>
          .
          <article-title>Combining semantic and syntactic information sources for 5-w question answering</article-title>
          .
          <source>INTERSPEECH</source>
          (
          <year>2009</year>
          ),
          <fpage>2707</fpage>
          -
          <lpage>2710</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          et al.
          <year>1998</year>
          .
          <article-title>A study on retrospective and on-line event detection</article-title>
          .
          <source>Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98</source>
          (
          <year>1998</year>
          ),
          <fpage>28</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>