=Paper= {{Paper |id=None |storemode=property |title=Coreference based event-argument relation extraction on biomedical text |pdfUrl=https://ceur-ws.org/Vol-714/Paper11_Yoshikawa.pdf |volume=Vol-714 |dblpUrl=https://dblp.org/rec/conf/smbm/YoshikawaRHAM10 }} ==Coreference based event-argument relation extraction on biomedical text== https://ceur-ws.org/Vol-714/Paper11_Yoshikawa.pdf
            Coreference Based Event-Argument Relation Extraction
                             on Biomedical Text

    Katsumasa Yoshikawa (NAIST, Japan) katsumasa-y@is.naist.jp
    Sebastian Riedel (University of Massachusetts, Amherst) sebastian.riedel@gmail.com
    Tsutomu Hirao (NTT CS Lab, Japan) hirao@cslab.kecl.ntt.co.jp
    Masayuki Asahara (NAIST, Japan) masayu-a@is.naist.jp
    Yuji Matsumoto (NAIST, Japan) matsu@is.naist.jp


                    Abstract

   This paper presents a new approach that exploits coreference information to extract event-argument (E-A) relations from biomedical documents. This approach has two advantages: (1) it can extract a large number of valuable E-A relations based on the concept of salience in discourse (Grosz et al., 1995); (2) it enables us to identify E-A relations over sentence boundaries (cross-links) using transitivity involving coreference relations. We propose two coreference-based models: a pipeline based on Support Vector Machine (SVM) classifiers, and a joint Markov Logic Network (MLN). We show the effectiveness of these models on a biomedical event corpus. Both models outperform their counterparts without coreference information. Comparing the two, the joint MLN outperforms the pipeline SVM when gold coreference information is available.

1 Introduction

   The increasing amount of biomedical text generated by high-throughput experiments demands automatic extraction of useful information with Natural Language Processing techniques. One of the more recent information extraction tasks is biomedical event extraction. With the introduction of the GENIA Event Corpus (Kim et al., 2008) and the BioNLP'09 shared task data (Kim et al., 2009), a set of documents annotated with events and their arguments, various approaches for event extraction have been proposed so far (Björne et al., 2009; Buyko et al., 2009; Poon and Vanderwende, 2010).
   However, previous work has only considered the problem on a per-sentence basis, neglecting possible information from other sentences in the same document that we may be able to exploit. In particular, no one has yet considered using coreference information to improve event extraction. Here we propose a new approach to extracting event-argument (E-A) relations that does make use of coreference information.
   Our approach includes two main ideas:
   1. aggressively extracting coreferent arguments based on salience in discourse;
   2. predicting arguments crossing sentence boundaries by transitivity.

Figure 1: Cross-Sentence Event-Argument Relation Extraction

   First, when considering discourse structure based on Centering Theory (Grosz et al., 1995), arguments which are coreferent with something (e.g. "The region") have higher salience in discourse. They are hence more likely to be arguments of events mentioned in the document. Using this information helps us to identify the right arguments for candidate events and increases the likelihood of extracting arguments with antecedents, corresponding to Arrow (A) in Figure 1. Note that identifying coreferent arguments is not just important for increasing the F1 score on the dataset: assuming that salience in discourse indicates the novel information the author wants to convey, it is the set of coreferent arguments we should extract at any cost.
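This preference for coreferent arguments can be sketched as a simple scoring heuristic. In the actual models it is realized as a weighted MLN formula (Section 3.4); the additive boost, token indices, and base scores below are purely illustrative, not the authors' implementation.

```python
def rank_candidates(candidates, coref_tokens, base_scores):
    """Prefer candidate arguments that take part in a coreference
    chain: such mentions are salient in the discourse and therefore
    more likely to fill a role of some event."""
    boost = 1.0  # illustrative; the MLN learns a weight instead
    return sorted(
        candidates,
        key=lambda c: base_scores[c] + (boost if c in coref_tokens else 0.0),
        reverse=True,
    )

# Tokens 11 ("The region") and 4 ("The IRF-2 promoter region") form a
# chain, so 11 outranks the equally scored non-coreferent token 20.
ranking = rank_candidates([11, 20], {11, 4}, {11: 0.4, 20: 0.4})
# ranking == [11, 20]
```

The same preference could equally be encoded as a feature of a discriminative classifier; the joint MLN formulation lets it interact with the other formulae described in Section 3.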


   Secondly, previous work on this task has primarily focused on identifying event-arguments within the same sentence. See Figure 1 for an example of cross-sentence event-argument relations. It illustrates E-A relation extraction including a cross-sentence E-A relation. In sentence S2, we have "inducible" as an event to be identified. When identifying intra-sentence arguments in S2, we can obtain "The region" as Theme and "both interferons" as Cause. However, in this example, "The region" is not sufficient as a Theme because "The region" is coreferent with "The IRF-2 promoter region" in S1. Thus, the true Theme of "inducible" is "The IRF-2 promoter region", and this phrase is actually more informative as an argument. On the other hand, "The region" is just an anaphor of the true argument. The transitivity idea 1 allows us to extract cross-sentence E-A relations such as Arrow (C) in Figure 1.
   To implement both ideas we propose two models that extract event-argument (E-A) relations involving coreference information. One is based on local classification with SVMs, and the other is based on a joint Markov Logic Network (MLN). To remain efficient, and akin to existing approaches, both look for events on a per-sentence basis. However, in contrast to previous work, our models consider as candidate arguments not only the tokens of the current sentence, but also all tokens in previous sentences that are identified as antecedents of some tokens in the current sentence.
   We show the effectiveness of our models on a biomedical corpus. They enable us to extract cross-sentence E-A relations: we achieve an F1 score of 69.7% for our MLN model, and 54.1% for the SVM pipeline. Moreover, with the idea of salience in discourse, our coreference-based approach helps us to improve intra-sentence E-A extraction, in particular when arguments have antecedents. In this case adding gold coreference information to the MLN improves the F-score by 16.9%.
   In place of gold coreference information, we also experiment with coreference relations predicted by a simple coreference resolver. Although the quality of the predicted coreference information is relatively poor, we show that using this information is still better than not using it at all.
   The remainder of this paper is organized as follows: Section 2 describes previous work on event extraction and its issues; Section 3 explains our proposed approach; Section 4 introduces our experimental setup; Section 5 presents the results of our experiments; and in Section 6 we conclude and present some ideas for future work.

2 Event-Argument Relation Extraction and the Issues of Previous Work

Figure 2: An Example of Biomedical Event Extraction

2.1 Biomedical Event Extraction

   Event extraction on biomedical text involves three sub-tasks: identification of event trigger words; classification of event types; and extraction of the relations between events and arguments (E-A). Figure 2 shows an example of event extraction. In this example, we have three event triggers: "induction", "increases", and "binding". The corresponding event types are Positive regulation (Pos reg) for "induction" and "increases", and Binding for "binding". In Figure 2, "increases" has two arguments: "induction" and "binding". The roles we have to identify fall into two classes: "Theme" and "Cause". In the case of our example the roles between "increases" and its two arguments are Cause and Theme, respectively.
   Note that biomedical corpora contain large numbers of nominal events. For example, in Figure 2 the arguments of "increases" are both nominal events. Such events can be arguments of other events, and they are often hard to identify.

2.2 Biomedical Corpora for Event Extraction

   There are two major corpora for biomedical event extraction. One is the GENIA Event Corpus (GEC) (Kim et al., 2008), and the other is the data of the BioNLP'09 shared task. 2 The latter is in fact derived from the GEC. There are some important differences between the two corpora.

event type  The GEC has fine-grained event type annotations (35 classes), while the BioNLP'09 data focuses on only 9 event subclasses.

non-event argument  The BioNLP'09 data does not differentiate between protein, gene, and RNA, while the GEC corpus does.

    1 e.g. If "The region" is a Theme of "inducible" and "The region" is coreferent with "The IRF-2...", then "The IRF-2..." is also a Theme of "inducible".
    2 http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/SharedTask/

coreference annotation  Both the GEC and BioNLP'09 corpora provide coreference annotations related to event extraction. However, in the case of the BioNLP'09 data the coreference information primarily concerns protein names and the abbreviations that follow in parentheses. The GEC, on the other hand, provides proper cross-sentence coreference. Moreover, the sheer number of its coreference annotations is much higher. 3
   For our work we choose the GEC, primarily because of the amount and quality of coreference information it provides. This allows us to train a coreference resolver, as well as to test our hypotheses when gold coreference annotations are available. A secondary reason to prefer the GEC over the BioNLP'09 corpus is its fine-grained annotation. We believe that this setting is more realistic.

2.3 Issues of Previous Work

   Various approaches have been proposed for event-argument relation extraction on biomedical text. However, even the current state of the art does not exploit coreference relations and focuses exclusively on intra-sentence E-A extraction.
   For example, Björne et al. (2009) achieved the best results for Task 1 in the BioNLP'09 competition 4 . However, they neglected all cross-sentence E-A relations. They also reported that they did try to detect cross-sentence arguments directly without the use of coreference, but this approach did not lead to a reasonable performance increase.
   In BioNLP'09, Riedel et al. (2009) proposed a joint Markov Logic Network to tackle the task, and achieved the best results for Task 2. Their system makes use of global features and constraints, and performs event trigger and argument detection jointly. Poon and Vanderwende (2010) also applied Markov Logic and achieved performance competitive with the state-of-the-art result of Björne et al. (2009). However, in both cases no cross-sentence information is exploited.
   To summarize, so far there is no research within biomedical event extraction that exploits coreference relations and tackles cross-sentence E-A relation extraction. By contrast, for predicate-argument relation extraction on a Japanese newswire text corpus, 5 Taira et al. (2008) do consider cross-sentence E-A extraction. However, they directly extract cross-sentence links without considering coreference relations. In addition, their approach is based on a pipeline of SVM classifiers, and their performance on cross-sentence E-A extraction was generally low. 6

2.4 The Direction of Our Work

   We present a new approach that exploits coreference information for E-A relation extraction. Moreover, in contrast to previous work on the BioNLP'09 shared task, we apply our models in a more realistic setting. Instead of relying on gold protein annotations, we use a Named Entity tagger; and instead of focusing on the coarse-grained annotation of the BioNLP task, we work with the GEC corpus and its fine-grained ontology.
   From now on, for brevity, we call cross-sentence event-argument relations "cross-links" and intra-sentence event-argument relations "intra-links".
   We propose two coreference-based models. One is an SVM-based model that extracts intra-links first and then cross-links as a post-processing step. The other is a joint model defined with Markov Logic that extracts intra-links and cross-links jointly and allows us to model salience in discourse in a principled manner.

3 Coreference Based Approach

   We have two ideas for incorporating coreference information into E-A relation extraction:
   • aggressively extracting valuable E-A relations based on "salience in discourse";
   • predicting cross-links by using "transitivity" involving coreference relations.
According to these ideas, we propose two approaches. One is a pipeline model based on SVM classifiers, and the other is a joint model based on Markov Logic.
   Before we present these approaches in detail, let us first describe coreference resolution as a pre-processing step.

3.1 Coreference Resolution

   There is some previous work on coreference resolution in the biomedical domain (Yang et al., 2004; Su et al., 2008). In our work, however, we introduce a simple coreference resolver based on the pairwise coreference model of Soon et al. (2001). 7

    3 Björne et al. (2009) also mentioned that coreference relations could be helpful for cross-sentence E-A extraction, but the coreference annotation necessary to train a coreference resolver is not present in the BioNLP'09 data.
    4 BioNLP'09 has three tasks: 1, 2, and 3. Task 1 is core event extraction and mandatory. Our work also focuses on Task 1.
    5 http://cl.naist.jp/nldata/corpus/
    6 In the low 20s of % F1.
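The mention-pair scheme of Soon et al. (2001) can be sketched as follows: every noun phrase is paired with each preceding candidate antecedent, and a binary classifier decides "corefer" or "not corefer". In this sketch a hand-weighted linear scorer stands in for a trained SVM, the feature names are illustrative, and, for simplicity, every accepted pair is linked rather than only the closest antecedent.

```python
def pair_features(antecedent, anaphor):
    """Basic features of the kind described in Section 3.1
    (word form, NE tag, distance); names are illustrative."""
    return {
        "head_match": antecedent["head"].lower() == anaphor["head"].lower(),
        "ne_match": (antecedent.get("ne") is not None
                     and antecedent.get("ne") == anaphor.get("ne")),
        "sent_dist": anaphor["sent"] - antecedent["sent"],
    }

def corefer(antecedent, anaphor):
    """Toy linear scorer standing in for a trained binary classifier."""
    f = pair_features(antecedent, anaphor)
    score = 2.0 * f["head_match"] + 0.5 * f["ne_match"] - 0.6 * f["sent_dist"]
    return score > 0.0

def resolve(mentions):
    """Classify all (antecedent, anaphor) pairs, earlier mention first."""
    pairs = []
    for j, anaphor in enumerate(mentions):
        for antecedent in mentions[:j]:
            if corefer(antecedent, anaphor):
                pairs.append((anaphor["id"], antecedent["id"]))
    return pairs
```

On the Figure 1 mentions, "The region" (head "region", sentence 1) links back to "The IRF-2 promoter region" (head "region", sentence 0), while "both interferons" links to nothing.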



                        Table 1: Local Features Used for the SVM Pipeline and the MLN Joint Model
                                                          SVM 1st phase          SVM 2nd phase
        Description                                     event & eventType         role (E-A)       MLN predicate
        Word Form                                               X                      X           word(i, w)
        Part-of-Speech                                          X                      X           pos(i, p)
        Word Stem                                               X                      X           stem(i, s)
        Named Entity Tag                                        X                      X           ne(i, n)
        Chunk Tag                                               X                      X           chunk(i, c)
        In Event Dictionary                                     X                      X           dict(i, d)
        Has Capital Letter                                      X                      X           capital(i)
        Has Numeric Characters                                  X                      X           numeric(i)
        Has Punctuation Characters                              X                      X           punc(i)
        Character Bigram                                        X                                  bigram(i, bi)
        Character Trigram                                       X                                  trigram(i, tri)
        Dependency Label                                        X                      X           dep(i, j, d)
        Labeled Dependency Path between Tokens                                         X           path(i, j, pt)
        Unlabeled Dependency Path between Tokens                                       X           pathNL(i, j, pt)
        Least Common Ancestor of Dependency Path                                       X           lca(i, j, L)

It employs a binary classifier which classifies all possible pairs of noun phrases as "corefer" or "not corefer". Popular external resources such as WordNet do not work well in the biomedical domain. Hence, our resolver identifies coreference relations with only basic features such as word form, POS, and NE tag, and achieves 59.1 pairwise F1 on the GEC under 5-fold cross-validation.

3.2 SVM Pipeline Model

   In our pipeline we apply the SVM model proposed by Björne et al. (2009). Their original model first extracts events and event types with a multi-class SVM (1st phase). Then it identifies the relations between all event-protein and event-event pairs with another multi-class SVM (2nd phase). Note that, in our setting, the 1st phase classifies event types into 36 classes (35 types + "NotEvent"). Moreover, while protein annotations were given in the BioNLP'09 shared task, for the GEC we have to extract them using an NE tagger. The features we use for the 1st and 2nd phases are summarized in the first and second columns of Table 1, respectively.
   After identifying intra-links, our model deterministically attaches, for each intra-sentence argument of an event, all antecedents inside/outside the sentence to the same event. Hence we implement transitivity as a post-processing step. However, it is difficult for the SVM pipeline to implement the idea of salience in discourse. We believe that a Markov Logic model is preferable in this respect.

3.3 MLN Joint Model

   Markov Logic (Richardson and Domingos, 2006) is an expressive template language that uses weighted first-order logic formulae to instantiate Markov Networks of repetitive structure. In Markov Logic, users design predicates and formulae to describe their problem. They then use software packages such as Alchemy 8 and Markov thebeast 9 to perform inference and learning.
   It is difficult to construct a Markov Logic Network for joint E-A relation extraction and coreference resolution across a complete document. Hence we follow two strategies: (1) restriction of argument candidates based on coreference relations; (2) construction of a joint model which identifies intra-links and cross-links jointly. Restricting argument candidates helps us to construct a very compact but effective model. The joint model enables us to extract intra-links and cross-links simultaneously and contributes to improved performance. In addition, we will see that this setup still allows us to implement the idea of salience in discourse with global formulae in Markov Logic.

3.3.1 Predicate Definition

   Our joint model is based on the model proposed by Riedel et al. (2009). We first define the predicates of the proposed Markov Logic Network (MLN). There are three "hidden" predicates corresponding to the target information we want to extract.

             Table 2: The Three Hidden Predicates
               event(i)           token i is an event
               eventType(i, t)    token i is an event with type t
               role(i, j, r)      token i has an argument j with role r

   In this work, role is the primary hidden predicate because role represents event-argument relations.
   Next we define observed predicates representing information that is available at both train and test time.

    7 Yang et al. (2004) also built the same kind of resolver as a baseline, using the original coreference annotations.
    8 http://alchemy.cs.washington.edu/
    9 http://code.google.com/p/thebeast/



   We define corefer(i, j), which indicates that token i is coreferent with token j (they are in the same entity cluster). corefer(i, j) obviously plays an important role in our coreference-based joint model. We list the remaining observed predicates in the last column of Table 1.
   Our MLN is composed of several weighted formulae that we divide into two classes. The first class contains local formulae for event, eventType, and role. We say that a formula is local if it considers only one atom of a hidden predicate. The formulae in the second class are global: they involve two or more atoms of hidden predicates. In our case they consider event, eventType, and role simultaneously.

3.3.2 Basic Local Formulae

   Our local features are based on previous work (Björne et al., 2009; Riedel et al., 2009) and are listed in Table 1. We exploit two types of formula representation, "simple token property" and "linked tokens property", as defined by Riedel et al. (2009).
   The first type of local formula describes properties of a single token; such properties are represented by the predicates in the first section of Table 1. The second type of local formula represents properties of token pairs, using the linked-tokens property predicates (dep, path, pathNL, and lca) in the second section of Table 1.

3.3.3 Basic Global Formulae

   Our global formulae are designed to enforce consistency between the three hidden predicates and are shown in Table 3. Riedel et al. (2009) presented more global formulae for their model. However, some of these do not work well for our task setting on the GENIA Event Corpus. We obtain the best results by using only the global formulae that ensure consistency of the hidden predicates.

3.4 Using Coreference Information

   We explain our coreference-based approaches with Figure 1. For our Markov Logic Network, let us describe the relations in Figure 1 with predicates. First, the two intra-links in S2 are described by role(13, 11, Theme) – Arrow (A) – and role(13, 15, Cause) – Arrow (D) 10 . Next, we represent the coreference relation by corefer(11, 4) – Bold Line (B). Finally, we express the cross-link as role(13, 4, Theme) – Arrow (C).
   With the example in Figure 1, we explain the two main concepts, Salience in Discourse (SiD) and Transitivity (T). We also present an additional idea, Feature Copy (FC).

Salience in Discourse  Again, an important advantage of our joint MLN model is the implementation of "salience in discourse". Entities mentioned over and over again are important in the discourse structure, and accordingly they are highly likely to be arguments of some events.
   In order to implement this idea, we add Formula (SiD) in the first row of Table 4. Formula (SiD) captures that if a token j is coreferent with another token k, there is at least one event related to token j. Our model with Formula (SiD) prefers coreferent arguments and aggressively connects them with events. In addition, our coreference resolver only extracts coreference relations which are related to events, since the coreference annotations in the GEC are always related to events.

Transitivity  Another main concept is "transitivity" for intra/cross-link extraction. 11 As mentioned earlier, the SVM pipeline enforces transitivity as a post-processing step.
   For the MLN joint model, let us consider the example of Figure 1 again:
        role(13, 11, Theme) ∧ corefer(11, 4) ⇒ role(13, 4, Theme)
This formula states that, if the event "inducible" has "The region" as a Theme and "The region" is coreferent with "The IRF-2 promoter region", then "The IRF-2 promoter region" is also a Theme of "inducible". The three atoms role(13, 11, Theme), corefer(11, 4), and role(13, 4, Theme) in this formula correspond to the three arrow edges (A), (B), and (C) in Figure 1, respectively. This formula is generalized as Formula (T), shown in the second row of Table 4.
   The merit of using Formula (T) is that we can take care of cross-links by solving only the intra-links and using the associated coreference relations. The candidate arguments of cross-links are exactly those arguments which are coreferent with intra-sentence mentions (antecedents).
   The improvement from Formula (T) depends on the performance of the intra-link predicate role(i, j, r) and the coreference relation corefer(j, k). Clearly, this performance depends in part on the effectiveness of Formula (T) itself.

    10 In these terms, phrasal arguments are anchored by the tokens which are the ROOT tokens of the dependency subtrees of the phrases.
    11 An antecedent of an argument is sometimes in a subordinate clause within the same sentence.



                                              Table 3: Basic Global Formulae
                         Formula                               Description
                         event(i) ⇒ ∃t.eventType(i, t)         If there is an event there should be an event type
                         eventType(i, t) ⇒ event(i)            If there is an event type there should be an event
                         role(i, j, r) ⇒ event(i)              If j plays the role r for i then i has to be an event
                         event(i) ⇒ ∃j.role(i, j, Theme)       Every event needs at least one Theme argument

                                              Table 4: Coreference Formulae
   Symbol     Name                    Formula                                             Description
   (SiD)      Salience in Discourse   corefer(j, k) ⇒ ∃i, r.role(i, j, r) ∧ event(i)      If a token j is coreferent with another token k, there
                                                                                          is at least one event related to token j
   (T)        Transitivity            role(i, j, r) ∧ corefer(j, k) ⇒ role(i, k, r)       If j plays the role r for i and j is coreferent with k,
                                                                                          then k also plays the role r for i
   (FC)       Feature Copy            corefer(j, k) ∧ F(k, +f) ⇒ role(i, j, r)            If j is coreferent with k and k has feature f, then j
                                                                                          plays the role r for i
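Read as hard checks over ground atoms (in the MLN they are soft, weighted formulae), the Table 4 formulae can be sketched with the Figure 1 example. The token indices follow the paper's example (13 = "inducible", 11 = "The region", 4 = "The IRF-2 promoter region"); the feature strings are illustrative.

```python
# Ground atoms for Figure 1.
role = {(13, 11, "Theme")}   # intra-link, Arrow (A)
corefer = {(11, 4)}          # coreference, Bold Line (B)
event = {13}

def apply_transitivity(role, corefer):
    """Formula (T): role(i, j, r) and corefer(j, k) imply role(i, k, r)."""
    derived = {(i, k, r)
               for (i, j, r) in role
               for (j2, k) in corefer if j == j2}
    return role | derived

def sid_holds(role, corefer, event):
    """Formula (SiD): every anaphor j with an antecedent fills
    at least one role of some event i."""
    return all(any(i in event and j == j2 for (i, j2, _) in role)
               for (j, _) in corefer)

def feature_copy(token_features, corefer):
    """Formula (FC), viewed as preprocessing: supplement an anaphor's
    feature set with the features of its antecedent."""
    enriched = {t: set(fs) for t, fs in token_features.items()}
    for (j, k) in corefer:
        enriched.setdefault(j, set()).update(token_features.get(k, set()))
    return enriched
```

Here apply_transitivity(role, corefer) adds the cross-link role(13, 4, Theme), i.e. Arrow (C) in Figure 1, and feature_copy gives the anaphor "The region" the word feature of its antecedent.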

that the improvements due to Formula (SiD) are also affected by Formula (T), because the latter impacts ∃i. role(i, j, r) in Formula (SiD). Thus, the formulae representing Salience in Discourse and Transitivity interact with each other.

Feature Copy We implement an additional use of coreference information through "Feature Copy". Anaphoric arguments such as "The region" in Figure 1 are sometimes more difficult to identify than "The IRF-2 promoter region" because of the lack of basic features (e.g. POS). Feature Copy supplements the features of an anaphor by adding the features of its antecedent. For the example of Figure 1, the formula

    corefer(11, 4) ∧ word(4, "IRF-2") ⇒ role(13, 11, Theme)

injects the word feature "IRF-2" into the anaphor "The region" in S2. Here word(i, w) represents the feature that a child token of token i in the dependency subtree is the word w. More precisely, this formula allows us to employ additional features of the antecedent to resolve the link role(13, 11, Theme). The formula is generalized as Formula (FC) in the last row of Table 4. In Formula (FC), F denotes the predicates representing basic features such as the word, POS, and NE tags of the tokens. Formula (FC) copies the features of cross-sentence arguments (antecedents) to intra-sentence arguments (anaphors). Feature Copy is not a novel idea, but it contributes to improved performance. The SVM pipeline model also adds the same features.

4 Experimental Setup

Let us summarise the data and tools we employ. The data for our experiments is the GENIA Event Corpus (GEC) (Kim et al., 2008). For feature generation, we employ the following tools. POS and NE tagging are performed with the GENIA Tagger [12]. For dependency path features we apply the Charniak-Johnson reranking parser with the self-training parsing model [13], and convert the results to dependency trees with pennconverter [14]. Learning and inference algorithms for the joint model are provided by Markov thebeast [15], a Markov Logic engine tailored for NLP applications. Our pipeline model employs SVM-struct [16] both in learning and testing. For coreference resolution, we also employ SVM-struct for binary classification.

Figure 3: Experimental Setup

Figure 3 shows the structure of our experimental system. Our experiments perform the following steps. (1) First, we perform preprocessing (tagging and parsing). (2) Then, we perform coreference resolution for all the documents and generate lists of token pairs that are coreferent to each other. (3) Finally, we train the event extractors involving coreference relations: SVM pipeline (SVM) and MLN joint (MLN). We evaluate all systems using 5-fold cross validation on GEC.

5 Results

In the following we will first show the results of our models for event extraction with and without coreference information. We will then present more detailed results concerning E-A relation extraction.

[12] http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/tagger/
[13] http://www.cs.brown.edu/~dmcc/biomedical.html
[14] http://nlp.cs.lth.se/software/treebank_converter/
[15] http://code.google.com/p/thebeast/
[16] http://www.cs.cornell.edu/People/tj/svm_light/svm_struct.html



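The three steps and the 5-fold evaluation above can be sketched as a skeleton. The step functions named in the comments are hypothetical stand-ins for the tools listed in Section 4, not the actual GENIA tooling:

```python
def five_fold_splits(docs, k=5):
    """Yield (train, test) document lists for k-fold cross validation.

    Splitting is done at the document level so that coreference chains
    never straddle a train/test boundary."""
    folds = [list(docs[i::k]) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        yield train, test

# Hypothetical stand-ins for the three steps described above:
# (1) preprocess(doc)        -> tagged and parsed document
# (2) resolve_coref(doc)     -> list of coreferent token pairs
# (3) train_extractor(train) -> SVM pipeline or MLN joint extractor

splits = list(five_fold_splits(list(range(10))))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 5 8 2
```

Splitting by document rather than by sentence matters here because step (2) produces cross-sentence coreference pairs that would otherwise leak between train and test.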
5.1 Impact of Coreference Based Approach

Table 5: Results of Event Extraction (F1)
  System      Corefer   event   eventType   role
  (a) SVM     NONE      77.0    67.8        52.3 ( 0.0)
  (b) SVM     SYS       77.0    67.8        53.6 (+1.3)
  (b′) SVM    GOLD      77.0    67.8        55.4 (+3.1)
  (c) MLN     NONE      80.5    70.6        51.7 ( 0.0)
  (g) MLN     SYS       80.8    70.8        53.8 (+2.1)
  (g′) MLN    GOLD      81.2    70.8        56.7 (+5.0)

We begin by showing the SVM and MLN results for event extraction in Table 5. We present F1-values for event, eventType, and role (E-A relation). The three columns (event, eventType, and role) in Table 5 correspond to the hidden predicates in Table 2.

Let us consider the rows (a)-(b) and (c)-(g). They compare the SVM and MLN approaches with and without the use of coreference information. The column "Corefer" indicates how coreference information is included: "NONE" – without coreference; "SYS" – with the coreference resolver; "GOLD" – with gold coreference annotations.

We note that adding coreference information leads to a 1.3 points F1 improvement for the SVM pipeline, and a 2.1 points improvement for the MLN joint model. Both improvements are statistically significant [17]. With gold coreference information, Systems (b′) and (g′) clearly achieve even larger improvements.

Let us move on to the comparisons between the SVM pipeline and MLN joint models. For event and eventType we compare row (b) with row (g) and observe that the MLN outperforms the SVM. This is to be contrasted with the results for the BioNLP'09 shared task, where the SVM model (Björne et al., 2009) outperformed the MLN (Riedel et al., 2009). This contrast may stem from the fact that GEC events are more difficult to extract due to a large number of event types and the lack of gold protein annotations, and hence local models are more likely to make mistakes that global consistency constraints can rule out.

For role extraction (E-A relations), the SVM pipeline and MLN joint models show comparable results, at least when not using coreference relations. However, when coreference information is taken into account, the MLN profits more. In fact, with gold coreference annotations, the MLN outperforms the SVM pipeline by a 1.3 points margin.

5.2 Detailed Results for Event-Argument Relation Extraction

Table 6 shows the three types of E-A relations we evaluate in detail.

Table 6: Three Types of Event-Argument Relations
  Type     Description                                               Edge in Figure 1
  Cross    E-A relations crossing sentence boundaries (cross-link)   Arrow (C)
  W-ANT    Intra-sentence E-As (intra-link) with antecedents         Arrow (A)
  Normal   Neither Cross nor W-ANT                                   Arrow (D)

They correspond to the arrows (A), (C), and (D) in Figure 1, respectively. We show the detailed results of E-A relation extraction in Table 7. All scores shown in the table are F1-values.

Table 7: Results of E-A Relation Extraction (F1)
  System      Corefer     Cross   W-ANT          Normal
  (a) SVM     NONE        0.0     56.0           53.6
  (b) SVM     SYS         27.9    57.0           54.3
  (b′) SVM    GOLD        54.1    57.3           55.4
  (c) MLN     NONE        0.0     49.8 ( 0.0)    53.2
  (d) MLN     FC          0.0     51.5 (+1.7)    53.7
  (e) MLN     FC+SiD      0.0     54.6 (+4.8)    53.3
  (f) MLN     FC+T        36.7    51.7 (+1.9)    53.7
  (g) MLN     FC+SiD+T    39.3    56.5 (+6.7)    54.3
  (g′) MLN    GOLD        69.7    66.7 (+16.9)   55.3

5.2.1 SVM Pipeline Model

The first part of Table 7 shows the results of the SVM pipeline with and without coreference relations. Systems (a), (b), and (b′) correspond to the first three rows in Table 5, respectively. We note that the SVM pipeline manages to extract cross-links with an F1 score of 27.9 points using coreference information from the resolver. The third row of Table 7 shows the results of the system with gold coreference, which extends System (b). With gold coreference, the SVM pipeline achieves 54.1 points for "Cross". However, the improvement we get for "W-ANT" relations is small, since the SVM pipeline model employs only the Feature Copy and Transitivity concepts. In particular, it cannot directly exploit Salience in Discourse as a feature.

5.2.2 MLN Joint Model

How does coreference help our MLN approach? To answer this question, the second part of Table 7 shows the results of the following six systems. Row (c) corresponds to the fourth row of Table 5 and shows results for the system that does not exploit any coreference information. Systems (d)-(g) include Formula (FC). In the sixth (e) and seventh (f) rows, we show the scores of the MLN joint model with Formula (SiD) and Formula (T), respectively. Our full joint model with both the (SiD) and (T) formulae comes in the eighth row (g).

[17] p < 0.01, McNemar's test, 2-tailed
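The exact two-tailed McNemar test behind this significance check can be sketched as follows. This is a generic textbook formulation over the discordant prediction counts of two systems, not the authors' evaluation code:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-tailed McNemar test.

    b: items system A got right and system B got wrong;
    c: items A got wrong and B got right.
    Under the null hypothesis the b discordant outcomes among
    n = b + c follow Binomial(n, 0.5); the two-tailed p-value
    doubles the smaller tail, capped at 1."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# E.g. 40 vs. 12 discordant predictions between two systems:
print(mcnemar_exact(40, 12) < 0.01)  # True
```

Only the discordant pairs matter here, which is why McNemar's test suits paired system comparisons on the same test set.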


System (g′) extends System (g) with gold coreference information.

By comparing Systems (d), (e), and (f) with System (c), we note that the Feature Copy (FC), Salience in Discourse (SiD), and Transitivity (T) formulae all successfully exploit coreference information. For "W-ANT", Systems (d) and (e) outperform System (c), which establishes that both Feature Copy and Salience in Discourse are sensible additions to an MLN E-A extractor. On the other hand, for "Cross" (cross-link), System (f) extracts cross-sentence E-A relations, which demonstrates that Transitivity is important, too. Next, our full system (g) achieved an F1 score of 39.3 points for cross-links and outperformed System (c) by a 6.7 points margin for "W-ANT". Further improvements with gold coreference are shown by our full system (g′): it achieved 69.7 points for "Cross" and improved over System (c) by a 16.9 points margin for "W-ANT".

5.2.3 SVM Pipeline vs. MLN Joint

The final evaluation compares the SVM pipeline and MLN joint models. Let us consider Table 7 again. When comparing System (a) with System (c), we notice that the SVM pipeline (a) outperforms the MLN joint model on "W-ANT" without coreference information. However, when comparing Systems (b) and (g) (using coreference information from the resolver), the MLN result is very competitive for "W-ANT" and 11.4 points better for "Cross".

Furthermore, with gold coreference, the MLN joint model (System (g′)) outperforms the SVM pipeline (System (b′)) in both "Cross" and "W-ANT", by margins of 15.6 points and 9.4 points, respectively. This demonstrates that our MLN model will further improve the extraction of cross-links and of intra-links with antecedents if we have a better coreference resolver.

We believe the reason for these results lies in two crucial differences between the SVM and MLN models:

• With Formula (SiD) in Table 4, the MLN joint model has more chances to extract "W-ANT" relations. It also affects the first term of Formula (T). By contrast, the SVM pipeline cannot easily model the notion of salience in discourse, so the effect of coreference is weak.

• Formula (T) of the MLN is defined as a soft constraint. Hence, other formulae may reject a cross-link suggested by Formula (T). The SVM pipeline deterministically identifies cross-links and is hence more prone to errors in the intra-sentence E-A extraction.

Finally, the potential for further improvement through the coreference-based approach is limited by the performance of intra-link extraction. Moreover, we also observe that 20% of cross-links are cases of zero-anaphora. Here the utility of coreference information is naturally limited, and our Formula (T) cannot come into effect due to missing corefer(j, k) atoms.

6 Conclusion and Future Work

In this paper we presented a novel approach to event extraction with coreference relations. Our approach incorporates coreference relations through the two concepts of salience in discourse and transitivity. The coreferent arguments we focus on are generally valuable for document understanding in terms of discourse structure, and they should be aggressively extracted. We proposed two models, an SVM pipeline and an MLN joint model, and both improved the attachment of intra-sentence and cross-sentence arguments related to coreference relations. Furthermore, we confirmed that improvements in coreference resolution lead to higher performance in event-argument relation extraction.

However, the potential for further improvement through the coreference-based approach is limited by the performance on intra-sentence links and by zero-anaphora cases. To overcome this problem, we plan to propose a collective approach over a whole document. Specifically, we are constructing a joint model of coreference resolution and event extraction that considers all tokens in a document, based on the idea of Narrative Schemas (Chambers and Jurafsky, 2009). If we take all tokens in a document into account at one time, we can consider various relations between events (event chains) through anaphoric chains. But to implement such a joint model in Markov Logic, we cannot escape fighting against time and space complexity. So, we are investigating a reasonable approximation for learning and inference in joint approaches.


References

Jari Björne, Juho Heimonen, Filip Ginter, Antti Airola, Tapio Pahikkala, and Tapio Salakoski. 2009. Extracting complex biological events with rich graph-based feature sets. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 10–18, Boulder, CO, USA. Association for Computational Linguistics.

Ekaterina Buyko, Erik Faessler, Joachim Wermter, and Udo Hahn. 2009. Event extraction from trimmed dependency graphs. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 19–27, Boulder, CO, USA. Association for Computational Linguistics.

Nathanael Chambers and Dan Jurafsky. 2009. Unsupervised learning of narrative schemas and their participants. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 602–610, Suntec, Singapore, August. Association for Computational Linguistics.

Barbara J. Grosz, Scott Weinstein, and Aravind K. Joshi. 1995. Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21:203–225.

Jin-Dong Kim, Tomoko Ohta, and Jun'ichi Tsujii. 2008. Corpus annotation for mining biomedical events from literature. BMC Bioinformatics, 9(1):10+.

Jin-Dong Kim, Tomoko Ohta, Sampo Pyysalo, Yoshinobu Kano, and Jun'ichi Tsujii. 2009. Overview of BioNLP'09 shared task on event extraction. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 1–9, Boulder, CO, USA. Association for Computational Linguistics.

Hoifung Poon and Lucy Vanderwende. 2010. Joint inference for knowledge extraction from biomedical literature. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 813–821, Los Angeles, California, June. Association for Computational Linguistics.

Matthew Richardson and Pedro Domingos. 2006. Markov logic networks. Machine Learning, 62(1-2):107–136.

Sebastian Riedel, Hong-Woo Chun, Toshihisa Takagi, and Jun'ichi Tsujii. 2009. A Markov logic approach to bio-molecular event extraction. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 41–49, Boulder, CO, USA. Association for Computational Linguistics.

Wee Meng Soon, Hwee Tou Ng, and Daniel Chung Yong Lim. 2001. A machine learning approach to coreference resolution of noun phrases. Computational Linguistics, 27(4):521–544.

Jian Su, Xiaofeng Yang, Huaqing Hong, Yuka Tateisi, and Jun'ichi Tsujii. 2008. Coreference resolution in biomedical texts: a machine learning approach. In Michael Ashburner, Ulf Leser, and Dietrich Rebholz-Schuhmann, editors, Ontologies and Text Mining for Life Sciences: Current Status and Future Perspectives, number 08131 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany.

Hirotoshi Taira, Sanae Fujita, and Masaaki Nagata. 2008. A Japanese predicate argument structure analysis using decision lists. In EMNLP '08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 523–532, Honolulu, HI, USA. Association for Computational Linguistics.

Xiaofeng Yang, Guodong Zhou, Jian Su, and Chew Lim Tan. 2004. Improving noun phrase coreference resolution by matching strings. In Proceedings of the 1st International Joint Conference on Natural Language Processing, pages 326–333.


