=Paper= {{Paper |id=None |storemode=property |title=Coreference based event-argument relation extraction on biomedical text |pdfUrl=https://ceur-ws.org/Vol-714/Paper11_Yoshikawa.pdf |volume=Vol-714 |dblpUrl=https://dblp.org/rec/conf/smbm/YoshikawaRHAM10 }} ==Coreference based event-argument relation extraction on biomedical text== https://ceur-ws.org/Vol-714/Paper11_Yoshikawa.pdf
            Coreference Based Event-Argument Relation Extraction
                             on Biomedical Text

    Katsumasa Yoshikawa (NAIST, Japan) katsumasa-y@is.naist.jp
    Sebastian Riedel (University of Massachusetts, Amherst) sebastian.riedel@gmail.com
    Tsutomu Hirao (NTT CS Lab, Japan) hirao@cslab.kecl.ntt.co.jp
    Masayuki Asahara (NAIST, Japan) masayu-a@is.naist.jp
    Yuji Matsumoto (NAIST, Japan) matsu@is.naist.jp


                    Abstract

   This paper presents a new approach that exploits coreference information to extract event-argument (E-A) relations from biomedical documents. This approach has two advantages: (1) it can extract a large number of valuable E-A relations based on the concept of salience in discourse (Grosz et al., 1995); (2) it enables us to identify E-A relations over sentence boundaries (cross-links) using transitivity involving coreference relations. We propose two coreference-based models: a pipeline based on Support Vector Machine (SVM) classifiers, and a joint Markov Logic Network (MLN). We show the effectiveness of these models on a biomedical event corpus. Both models outperform their counterparts without coreference information. Comparing the two, the joint MLN outperforms the pipeline SVM when gold coreference information is available.

1 Introduction

   The increasing amount of biomedical text generated by high-throughput experiments demands automatic extraction of useful information with Natural Language Processing techniques. One of the more recent information extraction tasks is biomedical event extraction. With the introduction of the GENIA Event Corpus (Kim et al., 2008) and the BioNLP'09 shared task data (Kim et al., 2009), a set of documents annotated with events and their arguments, various approaches for event extraction have been proposed so far (Björne et al., 2009; Buyko et al., 2009; Poon and Vanderwende, 2010).
   However, previous work has only considered the problem on a per-sentence basis, neglecting possible information from other sentences in the same document that we may be able to exploit. In particular, no one has yet considered using coreference information to improve event extraction. Here we propose a new approach to extracting event-argument (E-A) relations that does make use of coreference information.
   Our approach includes two main ideas:
   1. aggressively extracting coreferent arguments based on salience in discourse;
   2. predicting arguments crossing sentence boundaries by transitivity.

Figure 1: Cross-Sentence Event-Argument Relation Extraction

   First, when considering discourse structure based on Centering Theory (Grosz et al., 1995), arguments which are coreferent with something (e.g. "The region") have higher salience in discourse. They are hence more likely to be arguments of events mentioned in the document. Using this information helps us to identify the right arguments for candidate events and increases the likelihood of extracting arguments with antecedents, corresponding to Arrow (A) in Figure 1. Note that identifying coreferent arguments is not just important for increasing the F1 score on the dataset: assuming that salience in discourse indicates the novel information the author wants to convey, it is the set of coreferent arguments we should extract at any cost.
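This preference for coreferent arguments can be sketched as a simple scoring heuristic. In the actual models it is realized as a weighted MLN formula (Section 3.4); the additive boost, token indices, and base scores below are purely illustrative, not the authors' implementation.

```python
def rank_candidates(candidates, coref_tokens, base_scores):
    """Prefer candidate arguments that take part in a coreference
    chain: such mentions are salient in the discourse and therefore
    more likely to fill a role of some event."""
    boost = 1.0  # illustrative; the MLN learns a weight instead
    return sorted(
        candidates,
        key=lambda c: base_scores[c] + (boost if c in coref_tokens else 0.0),
        reverse=True,
    )

# Tokens 11 ("The region") and 4 ("The IRF-2 promoter region") form a
# chain, so 11 outranks the equally scored non-coreferent token 20.
ranking = rank_candidates([11, 20], {11, 4}, {11: 0.4, 20: 0.4})
# ranking == [11, 20]
```

The same preference could equally be encoded as a feature of a discriminative classifier; the joint MLN formulation lets it interact with the other formulae described in Section 3.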


   Secondly, previous work on this task has primarily focused on identifying event-arguments within the same sentence. See Figure 1 for an example of cross-sentence event-argument relations. It illustrates E-A relation extraction including a cross-sentence E-A relation. In sentence S2, we have "inducible" as an event to be identified. When identifying intra-sentence arguments in S2, we can obtain "The region" as Theme and "both interferons" as Cause. However, in this example, "The region" is not sufficient as a Theme because "The region" is coreferent with "The IRF-2 promoter region" in S1. Thus, the true Theme of "inducible" is "The IRF-2 promoter region", and this phrase is actually more informative as an argument. On the other hand, "The region" is just an anaphor of the true argument. The transitivity idea 1 allows us to extract cross-sentence E-A relations such as Arrow (C) in Figure 1.
   To implement both ideas we propose two models that extract event-argument (E-A) relations involving coreference information. One is based on local classification with SVMs, and the other is based on a joint Markov Logic Network (MLN). To remain efficient, and akin to existing approaches, both look for events on a per-sentence basis. However, in contrast to previous work, our models consider as candidate arguments not only the tokens of the current sentence, but also all tokens in previous sentences that are identified as antecedents of some tokens in the current sentence.
   We show the effectiveness of our models on a biomedical corpus. They enable us to extract cross-sentence E-A relations: we achieve an F1 score of 69.7% for our MLN model, and 54.1% for the SVM pipeline. Moreover, with the idea of salience in discourse, our coreference-based approach helps us to improve intra-sentence E-A extraction, in particular when arguments have antecedents. In this case adding gold coreference information to the MLN improves the F-score by 16.9%.
   In place of gold coreference information, we also experiment with coreference relations predicted by a simple coreference resolver. Although the quality of the predicted coreference information is relatively poor, we show that using this information is still better than not using it at all.
   The remainder of this paper is organized as follows: Section 2 describes previous work on event extraction and its issues; Section 3 explains our proposed approach; Section 4 introduces our experimental setup; Section 5 presents the results of our experiments; and in Section 6 we conclude and present some ideas for future work.

2 Event-Argument Relation Extraction and the Issues of Previous Work

Figure 2: An Example of Biomedical Event Extraction

2.1 Biomedical Event Extraction

   Event extraction on biomedical text involves three sub-tasks: identification of event trigger words; classification of event types; and extraction of the relations between events and arguments (E-A). Figure 2 shows an example of event extraction. In this example, we have three event triggers: "induction", "increases", and "binding". The corresponding event types are Positive regulation (Pos reg) for "induction" and "increases", and Binding for "binding". In Figure 2, "increases" has two arguments: "induction" and "binding". The roles we have to identify fall into two classes: "Theme" and "Cause". In the case of our example the roles between "increases" and its two arguments are Cause and Theme, respectively.
   Note that biomedical corpora contain large numbers of nominal events. For example, in Figure 2 the arguments of "increases" are both nominal events. Such events can be arguments of other events, and they are often hard to identify.

2.2 Biomedical Corpora for Event Extraction

   There are two major corpora for biomedical event extraction. One is the GENIA Event Corpus (GEC) (Kim et al., 2008), and the other is the data of the BioNLP'09 shared task. 2 The latter is in fact derived from the GEC. There are some important differences between the two corpora.

event type  The GEC has fine-grained event type annotations (35 classes), while the BioNLP'09 data focuses on only 9 event subclasses.

non-event argument  The BioNLP'09 data does not differentiate between protein, gene, and RNA, while the GEC corpus does.

    1 e.g. If "The region" is a Theme of "inducible" and "The region" is coreferent with "The IRF-2...", then "The IRF-2..." is also a Theme of "inducible".
    2 http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/SharedTask/

coreference annotation  Both the GEC and BioNLP'09 corpora provide coreference annotations related to event extraction. However, in the case of the BioNLP'09 data the coreference information primarily concerns protein names and the abbreviations that follow in parentheses. The GEC, on the other hand, provides proper cross-sentence coreference. Moreover, the sheer number of its coreference annotations is much higher. 3
   For our work we choose the GEC, primarily because of the amount and quality of coreference information it provides. This allows us to train a coreference resolver, as well as to test our hypotheses when gold coreference annotations are available. A secondary reason to prefer the GEC over the BioNLP'09 corpus is its fine-grained annotation. We believe that this setting is more realistic.

2.3 Issues of Previous Work

   Various approaches have been proposed for event-argument relation extraction on biomedical text. However, even the current state of the art does not exploit coreference relations and focuses exclusively on intra-sentence E-A extraction.
   For example, Björne et al. (2009) achieved the best results for Task 1 in the BioNLP'09 competition 4 . However, they neglected all cross-sentence E-A relations. They also reported that they did try to detect cross-sentence arguments directly without the use of coreference, but this approach did not lead to a reasonable performance increase.
   In BioNLP'09, Riedel et al. (2009) proposed a joint Markov Logic Network to tackle the task, and achieved the best results for Task 2. Their system makes use of global features and constraints, and performs event trigger and argument detection jointly. Poon and Vanderwende (2010) also applied Markov Logic and achieved performance competitive with the state-of-the-art result of Björne et al. (2009). However, in both cases no cross-sentence information is exploited.
   To summarize, so far there is no research within biomedical event extraction that exploits coreference relations and tackles cross-sentence E-A relation extraction. By contrast, for predicate-argument relation extraction on a Japanese newswire text corpus, 5 Taira et al. (2008) do consider cross-sentence E-A extraction. However, they directly extract cross-sentence links without considering coreference relations. In addition, their approach is based on a pipeline of SVM classifiers, and their performance on cross-sentence E-A extraction was generally low. 6

2.4 The Direction of Our Work

   We present a new approach that exploits coreference information for E-A relation extraction. Moreover, in contrast to previous work on the BioNLP'09 shared task, we apply our models in a more realistic setting. Instead of relying on gold protein annotations, we use a Named Entity tagger; and instead of focusing on the coarse-grained annotation of the BioNLP task, we work with the GEC corpus and its fine-grained ontology.
   From now on, for brevity, we call cross-sentence event-argument relations "cross-links" and intra-sentence event-argument relations "intra-links".
   We propose two coreference-based models. One is an SVM-based model that extracts intra-links first and then cross-links as a post-processing step. The other is a joint model defined with Markov Logic that extracts intra-links and cross-links jointly and allows us to model salience in discourse in a principled manner.

3 Coreference Based Approach

   We have two ideas for incorporating coreference information into E-A relation extraction:
   • aggressively extracting valuable E-A relations based on "salience in discourse";
   • predicting cross-links by using "transitivity" involving coreference relations.
According to these ideas, we propose two approaches. One is a pipeline model based on SVM classifiers, and the other is a joint model based on Markov Logic.
   Before we present these approaches in detail, let us first describe coreference resolution as a pre-processing step.

3.1 Coreference Resolution

   There is some previous work on coreference resolution in the biomedical domain (Yang et al., 2004; Su et al., 2008). In our work, however, we introduce a simple coreference resolver based on the pairwise coreference model of Soon et al. (2001). 7

    3 Björne et al. (2009) also mentioned that coreference relations could be helpful for cross-sentence E-A extraction, but the coreference annotation necessary to train a coreference resolver is not present in the BioNLP'09 data.
    4 BioNLP'09 has three tasks: 1, 2, and 3. Task 1 is core event extraction and mandatory. Our work also focuses on Task 1.
    5 http://cl.naist.jp/nldata/corpus/
    6 In the low 20s of % F1.
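The mention-pair scheme of Soon et al. (2001) can be sketched as follows: every noun phrase is paired with each preceding candidate antecedent, and a binary classifier decides "corefer" or "not corefer". In this sketch a hand-weighted linear scorer stands in for a trained SVM, the feature names are illustrative, and, for simplicity, every accepted pair is linked rather than only the closest antecedent.

```python
def pair_features(antecedent, anaphor):
    """Basic features of the kind described in Section 3.1
    (word form, NE tag, distance); names are illustrative."""
    return {
        "head_match": antecedent["head"].lower() == anaphor["head"].lower(),
        "ne_match": (antecedent.get("ne") is not None
                     and antecedent.get("ne") == anaphor.get("ne")),
        "sent_dist": anaphor["sent"] - antecedent["sent"],
    }

def corefer(antecedent, anaphor):
    """Toy linear scorer standing in for a trained binary classifier."""
    f = pair_features(antecedent, anaphor)
    score = 2.0 * f["head_match"] + 0.5 * f["ne_match"] - 0.6 * f["sent_dist"]
    return score > 0.0

def resolve(mentions):
    """Classify all (antecedent, anaphor) pairs, earlier mention first."""
    pairs = []
    for j, anaphor in enumerate(mentions):
        for antecedent in mentions[:j]:
            if corefer(antecedent, anaphor):
                pairs.append((anaphor["id"], antecedent["id"]))
    return pairs
```

On the Figure 1 mentions, "The region" (head "region", sentence 1) links back to "The IRF-2 promoter region" (head "region", sentence 0), while "both interferons" links to nothing.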



                        Table 1: Local Features Used for the SVM Pipeline and the MLN Joint Model
                                                          SVM 1st phase          SVM 2nd phase
        Description                                     event & eventType         role (E-A)       MLN predicate
        Word Form                                               X                      X           word(i, w)
        Part-of-Speech                                          X                      X           pos(i, p)
        Word Stem                                               X                      X           stem(i, s)
        Named Entity Tag                                        X                      X           ne(i, n)
        Chunk Tag                                               X                      X           chunk(i, c)
        In Event Dictionary                                     X                      X           dict(i, d)
        Has Capital Letter                                      X                      X           capital(i)
        Has Numeric Characters                                  X                      X           numeric(i)
        Has Punctuation Characters                              X                      X           punc(i)
        Character Bigram                                        X                                  bigram(i, bi)
        Character Trigram                                       X                                  trigram(i, tri)
        Dependency Label                                        X                      X           dep(i, j, d)
        Labeled Dependency Path between Tokens                                         X           path(i, j, pt)
        Unlabeled Dependency Path between Tokens                                       X           pathNL(i, j, pt)
        Least Common Ancestor of Dependency Path                                       X           lca(i, j, L)

It employs a binary classifier which classifies all possible pairs of noun phrases as "corefer" or "not corefer". Popular external resources such as WordNet do not work well in the biomedical domain. Hence, our resolver identifies coreference relations with only basic features such as word form, POS, and NE tag, and achieves 59.1 pairwise F1 on the GEC under 5-fold cross-validation.

3.2 SVM Pipeline Model

   In our pipeline we apply the SVM model proposed by Björne et al. (2009). Their original model first extracts events and event types with a multi-class SVM (1st phase). Then it identifies the relations between all event-protein and event-event pairs with another multi-class SVM (2nd phase). Note that, in our setting, the 1st phase classifies event types into 36 classes (35 types + "NotEvent"). Moreover, while protein annotations were given in the BioNLP'09 shared task, for the GEC we have to extract them using an NE tagger. The features we use for the 1st and 2nd phases are summarized in the first and second columns of Table 1, respectively.
   After identifying intra-links, our model deterministically attaches, for each intra-sentence argument of an event, all antecedents inside/outside the sentence to the same event. Hence we implement transitivity as a post-processing step. However, it is difficult for the SVM pipeline to implement the idea of salience in discourse. We believe that a Markov Logic model is preferable in this respect.

3.3 MLN Joint Model

   Markov Logic (Richardson and Domingos, 2006) is an expressive template language that uses weighted first-order logic formulae to instantiate Markov Networks of repetitive structure. In Markov Logic, users design predicates and formulae to describe their problem. They then use software packages such as Alchemy 8 and Markov thebeast 9 to perform inference and learning.
   It is difficult to construct a Markov Logic Network for joint E-A relation extraction and coreference resolution across a complete document. Hence we follow two strategies: (1) restriction of argument candidates based on coreference relations; (2) construction of a joint model which identifies intra-links and cross-links jointly. Restricting argument candidates helps us to construct a very compact but effective model. The joint model enables us to extract intra-links and cross-links simultaneously and contributes to improved performance. In addition, we will see that this setup still allows us to implement the idea of salience in discourse with global formulae in Markov Logic.

3.3.1 Predicate Definition

   Our joint model is based on the model proposed by Riedel et al. (2009). We first define the predicates of the proposed Markov Logic Network (MLN). There are three "hidden" predicates corresponding to the target information we want to extract.

             Table 2: The Three Hidden Predicates
               event(i)           token i is an event
               eventType(i, t)    token i is an event with type t
               role(i, j, r)      token i has an argument j with role r

   In this work, role is the primary hidden predicate because role represents event-argument relations.
   Next we define observed predicates representing information that is available at both train and test time.

    7 Yang et al. (2004) also built the same kind of resolver as a baseline, using the original coreference annotations.
    8 http://alchemy.cs.washington.edu/
    9 http://code.google.com/p/thebeast/



   We define corefer(i, j), which indicates that token i is coreferent with token j (they are in the same entity cluster). corefer(i, j) obviously plays an important role in our coreference-based joint model. We list the remaining observed predicates in the last column of Table 1.
   Our MLN is composed of several weighted formulae that we divide into two classes. The first class contains local formulae for event, eventType, and role. We say that a formula is local if it considers only one atom of a hidden predicate. The formulae in the second class are global: they involve two or more atoms of hidden predicates. In our case they consider event, eventType, and role simultaneously.

3.3.2 Basic Local Formulae

   Our local features are based on previous work (Björne et al., 2009; Riedel et al., 2009) and are listed in Table 1. We exploit two types of formula representation, "simple token property" and "linked tokens property", as defined by Riedel et al. (2009).
   The first type of local formula describes properties of a single token; such properties are represented by the predicates in the first section of Table 1. The second type of local formula represents properties of token pairs, using the linked-tokens property predicates (dep, path, pathNL, and lca) in the second section of Table 1.

3.3.3 Basic Global Formulae

   Our global formulae are designed to enforce consistency between the three hidden predicates and are shown in Table 3. Riedel et al. (2009) presented more global formulae for their model. However, some of these do not work well for our task setting on the GENIA Event Corpus. We obtain the best results by using only the global formulae that ensure consistency of the hidden predicates.

3.4 Using Coreference Information

   We explain our coreference-based approaches with Figure 1. For our Markov Logic Network, let us describe the relations in Figure 1 with predicates. First, the two intra-links in S2 are described by role(13, 11, Theme) – Arrow (A) – and role(13, 15, Cause) – Arrow (D) 10 . Next, we represent the coreference relation by corefer(11, 4) – Bold Line (B). Finally, we express the cross-link as role(13, 4, Theme) – Arrow (C).
   With the example in Figure 1, we explain the two main concepts, Salience in Discourse (SiD) and Transitivity (T). We also present an additional idea, Feature Copy (FC).

Salience in Discourse  Again, an important advantage of our joint MLN model is the implementation of "salience in discourse". Entities mentioned over and over again are important in the discourse structure, and accordingly they are highly likely to be arguments of some events.
   In order to implement this idea, we add Formula (SiD) in the first row of Table 4. Formula (SiD) captures that if a token j is coreferent with another token k, there is at least one event related to token j. Our model with Formula (SiD) prefers coreferent arguments and aggressively connects them with events. In addition, our coreference resolver only extracts coreference relations which are related to events, since the coreference annotations in the GEC are always related to events.

Transitivity  Another main concept is "transitivity" for intra/cross-link extraction. 11 As mentioned earlier, the SVM pipeline enforces transitivity as a post-processing step.
   For the MLN joint model, let us consider the example of Figure 1 again:
        role(13, 11, Theme) ∧ corefer(11, 4) ⇒ role(13, 4, Theme)
This formula states that, if the event "inducible" has "The region" as a Theme and "The region" is coreferent with "The IRF-2 promoter region", then "The IRF-2 promoter region" is also a Theme of "inducible". The three atoms role(13, 11, Theme), corefer(11, 4), and role(13, 4, Theme) in this formula correspond to the three arrow edges (A), (B), and (C) in Figure 1, respectively. This formula is generalized as Formula (T), shown in the second row of Table 4.
   The merit of using Formula (T) is that we can take care of cross-links by solving only the intra-links and using the associated coreference relations. The candidate arguments of cross-links are exactly those arguments which are coreferent with intra-sentence mentions (antecedents).
   The improvement from Formula (T) depends on the performance of the intra-link predicate role(i, j, r) and the coreference relation corefer(j, k). Clearly, this performance depends in part on the effectiveness of Formula (T) itself.

    10 In these terms, phrasal arguments are anchored by the tokens which are the ROOT tokens of the dependency subtrees of the phrases.
    11 An antecedent of an argument is sometimes in a subordinate clause within the same sentence.



                                              Table 3: Basic Global Formulae
                         Formula                               Description
                         event(i) ⇒ ∃t.eventType(i, t)         If there is an event there should be an event type
                         eventType(i, t) ⇒ event(i)            If there is an event type there should be an event
                         role(i, j, r) ⇒ event(i)              If j plays the role r for i then i has to be an event
                         event(i) ⇒ ∃j.role(i, j, Theme)       Every event needs at least one Theme argument

                                              Table 4: Coreference Formulae
   Symbol     Name                    Formula                                             Description
   (SiD)      Salience in Discourse   corefer(j, k) ⇒ ∃i, r.role(i, j, r) ∧ event(i)      If a token j is coreferent with another token k, there
                                                                                          is at least one event related to token j
   (T)        Transitivity            role(i, j, r) ∧ corefer(j, k) ⇒ role(i, k, r)       If j plays the role r for i and j is coreferent with k,
                                                                                          then k also plays the role r for i
   (FC)       Feature Copy            corefer(j, k) ∧ F(k, +f) ⇒ role(i, j, r)            If j is coreferent with k and k has feature f, then j
                                                                                          plays the role r for i
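Read as hard checks over ground atoms (in the MLN they are soft, weighted formulae), the Table 4 formulae can be sketched with the Figure 1 example. The token indices follow the paper's example (13 = "inducible", 11 = "The region", 4 = "The IRF-2 promoter region"); the feature strings are illustrative.

```python
# Ground atoms for Figure 1.
role = {(13, 11, "Theme")}   # intra-link, Arrow (A)
corefer = {(11, 4)}          # coreference, Bold Line (B)
event = {13}

def apply_transitivity(role, corefer):
    """Formula (T): role(i, j, r) and corefer(j, k) imply role(i, k, r)."""
    derived = {(i, k, r)
               for (i, j, r) in role
               for (j2, k) in corefer if j == j2}
    return role | derived

def sid_holds(role, corefer, event):
    """Formula (SiD): every anaphor j with an antecedent fills
    at least one role of some event i."""
    return all(any(i in event and j == j2 for (i, j2, _) in role)
               for (j, _) in corefer)

def feature_copy(token_features, corefer):
    """Formula (FC), viewed as preprocessing: supplement an anaphor's
    feature set with the features of its antecedent."""
    enriched = {t: set(fs) for t, fs in token_features.items()}
    for (j, k) in corefer:
        enriched.setdefault(j, set()).update(token_features.get(k, set()))
    return enriched
```

Here apply_transitivity(role, corefer) adds the cross-link role(13, 4, Theme), i.e. Arrow (C) in Figure 1, and feature_copy gives the anaphor "The region" the word feature of its antecedent.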

that the improvements due to Formula (SiD) are also affected by Formula (T), because the latter impacts ∃i. role(i, j, r) in Formula (SiD). Thus, the formulae representing Salience in Discourse and Transitivity interact with each other.

Feature Copy We implement an additional use of coreference information through "Feature Copy". Anaphoric arguments such as "The region" in Figure 1 are sometimes more difficult to identify than "The IRF-2 promoter region" because of the lack of basic features (e.g. POS). Feature Copy supplements the features of an anaphor by adding the features of its antecedent. For the example of Figure 1, the formula

    corefer(11, 4) ∧ word(4, "IRF-2") ⇒ role(13, 11, Theme)

injects the word feature "IRF-2" into the anaphor "The region" in S2. Here word(i, w) represents the feature that a child token of token i in the dependency subtree is the word w. More precisely, this formula allows us to employ additional features of the antecedent to resolve the link role(13, 11, Theme). The formula is generalized as Formula (FC) in the last row of Table 4. In Formula (FC), F denotes the predicates representing basic features such as the word, POS, and NE tags of the tokens. Formula (FC) copies the features of cross-sentence arguments (antecedents) to intra-sentence arguments (anaphors). Feature Copy is not a novel idea, but it contributes to improved performance. The SVM pipeline model also adds the same features.

4 Experimental Setup

Let us summarise the data and tools we employ. The data for our experiments is the GENIA Event Corpus (GEC) (Kim et al., 2008). For feature generation, we employ the following tools. POS and NE tagging are performed with the GENIA Tagger [12]. For dependency path features we apply the Charniak-Johnson reranking parser with the self-training parsing model [13], and convert the results to dependency trees with pennconverter [14]. Learning and inference algorithms for the joint model are provided by Markov thebeast [15], a Markov Logic engine tailored for NLP applications. Our pipeline model employs SVM-struct [16] both in learning and testing. For coreference resolution, we also employ SVM-struct for binary classification.

Figure 3: Experimental Setup

Figure 3 shows the structure of our experimental system. Our experiments perform the following steps. (1) First, we perform preprocessing (tagging and parsing). (2) Then, we perform coreference resolution for all the documents and generate lists of token pairs that are coreferent to each other. (3) Finally, we train the event extractors involving coreference relations: SVM pipeline (SVM) and MLN joint (MLN). We evaluate all systems using 5-fold cross validation on GEC.

5 Results

In the following we will first show the results of our models for event extraction with and without coreference information. We will then present more detailed results concerning E-A relation extraction.

[12] http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/tagger/
[13] http://www.cs.brown.edu/~dmcc/biomedical.html
[14] http://nlp.cs.lth.se/software/treebank_converter/
[15] http://code.google.com/p/thebeast/
[16] http://www.cs.cornell.edu/People/tj/svm_light/svm_struct.html



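The three steps and the 5-fold evaluation above can be sketched as a skeleton. The step functions named in the comments are hypothetical stand-ins for the tools listed in Section 4, not the actual GENIA tooling:

```python
def five_fold_splits(docs, k=5):
    """Yield (train, test) document lists for k-fold cross validation.

    Splitting is done at the document level so that coreference chains
    never straddle a train/test boundary."""
    folds = [list(docs[i::k]) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        yield train, test

# Hypothetical stand-ins for the three steps described above:
# (1) preprocess(doc)        -> tagged and parsed document
# (2) resolve_coref(doc)     -> list of coreferent token pairs
# (3) train_extractor(train) -> SVM pipeline or MLN joint extractor

splits = list(five_fold_splits(list(range(10))))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 5 8 2
```

Splitting by document rather than by sentence matters here because step (2) produces cross-sentence coreference pairs that would otherwise leak between train and test.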
5.1 Impact of Coreference Based Approach

Table 5: Results of Event Extraction (F1)
  System      Corefer   event   eventType   role
  (a) SVM     NONE      77.0    67.8        52.3 ( 0.0)
  (b) SVM     SYS       77.0    67.8        53.6 (+1.3)
  (b′) SVM    GOLD      77.0    67.8        55.4 (+3.1)
  (c) MLN     NONE      80.5    70.6        51.7 ( 0.0)
  (g) MLN     SYS       80.8    70.8        53.8 (+2.1)
  (g′) MLN    GOLD      81.2    70.8        56.7 (+5.0)

We begin by showing the SVM and MLN results for event extraction in Table 5. We present F1-values for event, eventType, and role (E-A relation). The three columns (event, eventType, and role) in Table 5 correspond to the hidden predicates in Table 2.

Let us consider the rows (a)-(b) and (c)-(g). They compare the SVM and MLN approaches with and without the use of coreference information. The column "Corefer" indicates how coreference information is included: "NONE" – without coreference; "SYS" – with the coreference resolver; "GOLD" – with gold coreference annotations.

We note that adding coreference information leads to a 1.3 points F1 improvement for the SVM pipeline, and a 2.1 points improvement for the MLN joint model. Both improvements are statistically significant [17]. With gold coreference information, Systems (b′) and (g′) clearly achieve even larger improvements.

Let us move on to the comparisons between the SVM pipeline and MLN joint models. For event and eventType we compare row (b) with row (g) and observe that the MLN outperforms the SVM. This is to be contrasted with the results for the BioNLP'09 shared task, where the SVM model (Björne et al., 2009) outperformed the MLN (Riedel et al., 2009). This contrast may stem from the fact that GEC events are more difficult to extract due to a large number of event types and the lack of gold protein annotations, and hence local models are more likely to make mistakes that global consistency constraints can rule out.

For role extraction (E-A relations), the SVM pipeline and MLN joint models show comparable results, at least when not using coreference relations. However, when coreference information is taken into account, the MLN profits more. In fact, with gold coreference annotations, the MLN outperforms the SVM pipeline by a 1.3 points margin.

5.2 Detailed Results for Event-Argument Relation Extraction

Table 6 shows the three types of E-A relations we evaluate in detail.

Table 6: Three Types of Event-Argument Relations
  Type     Description                                               Edge in Figure 1
  Cross    E-A relations crossing sentence boundaries (cross-link)   Arrow (C)
  W-ANT    Intra-sentence E-As (intra-link) with antecedents         Arrow (A)
  Normal   Neither Cross nor W-ANT                                   Arrow (D)

They correspond to the arrows (A), (C), and (D) in Figure 1, respectively. We show the detailed results of E-A relation extraction in Table 7. All scores shown in the table are F1-values.

Table 7: Results of E-A Relation Extraction (F1)
  System      Corefer     Cross   W-ANT          Normal
  (a) SVM     NONE        0.0     56.0           53.6
  (b) SVM     SYS         27.9    57.0           54.3
  (b′) SVM    GOLD        54.1    57.3           55.4
  (c) MLN     NONE        0.0     49.8 ( 0.0)    53.2
  (d) MLN     FC          0.0     51.5 (+1.7)    53.7
  (e) MLN     FC+SiD      0.0     54.6 (+4.8)    53.3
  (f) MLN     FC+T        36.7    51.7 (+1.9)    53.7
  (g) MLN     FC+SiD+T    39.3    56.5 (+6.7)    54.3
  (g′) MLN    GOLD        69.7    66.7 (+16.9)   55.3

5.2.1 SVM Pipeline Model

The first part of Table 7 shows the results of the SVM pipeline with and without coreference relations. Systems (a), (b), and (b′) correspond to the first three rows in Table 5, respectively. We note that the SVM pipeline manages to extract cross-links with an F1 score of 27.9 points using coreference information from the resolver. The third row of Table 7 shows the results of the system with gold coreference, which extends System (b). With gold coreference, the SVM pipeline achieves 54.1 points for "Cross". However, the improvement we get for "W-ANT" relations is small, since the SVM pipeline model employs only the Feature Copy and Transitivity concepts. In particular, it cannot directly exploit Salience in Discourse as a feature.

5.2.2 MLN Joint Model

How does coreference help our MLN approach? To answer this question, the second part of Table 7 shows the results of the following six systems. Row (c) corresponds to the fourth row of Table 5 and shows results for the system that does not exploit any coreference information. Systems (d)-(g) include Formula (FC). In the sixth (e) and seventh (f) rows, we show the scores of the MLN joint model with Formula (SiD) and Formula (T), respectively. Our full joint model with both the (SiD) and (T) formulae comes in the eighth row (g).

[17] p < 0.01, McNemar's test, 2-tailed
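The exact two-tailed McNemar test behind this significance check can be sketched as follows. This is a generic textbook formulation over the discordant prediction counts of two systems, not the authors' evaluation code:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-tailed McNemar test.

    b: items system A got right and system B got wrong;
    c: items A got wrong and B got right.
    Under the null hypothesis the b discordant outcomes among
    n = b + c follow Binomial(n, 0.5); the two-tailed p-value
    doubles the smaller tail, capped at 1."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# E.g. 40 vs. 12 discordant predictions between two systems:
print(mcnemar_exact(40, 12) < 0.01)  # True
```

Only the discordant pairs matter here, which is why McNemar's test suits paired system comparisons on the same test set.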


System (g′) extends System (g) with gold coreference information.

By comparing Systems (d), (e), and (f) with System (c), we note that the Feature Copy (FC), Salience in Discourse (SiD), and Transitivity (T) formulae all successfully exploit coreference information. For "W-ANT", Systems (d) and (e) outperform System (c), which establishes that both Feature Copy and Salience in Discourse are sensible additions to an MLN E-A extractor. On the other hand, for "Cross" (cross-link), System (f) extracts cross-sentence E-A relations, which demonstrates that Transitivity is important, too. Next, our full system (g) achieved an F1 score of 39.3 points for cross-links and outperformed System (c) by a 6.7 points margin for "W-ANT". Further improvements with gold coreference are shown by our full system (g′): it achieved 69.7 points for "Cross" and improved over System (c) by a 16.9 points margin for "W-ANT".

5.2.3 SVM Pipeline vs. MLN Joint

The final evaluation compares the SVM pipeline and MLN joint models. Let us consider Table 7 again. When comparing System (a) with System (c), we notice that the SVM pipeline (a) outperforms the MLN joint model on "W-ANT" without coreference information. However, when comparing Systems (b) and (g) (using coreference information from the resolver), the MLN result is very competitive for "W-ANT" and 11.4 points better for "Cross".

Furthermore, with gold coreference, the MLN joint model (System (g′)) outperforms the SVM pipeline (System (b′)) in both "Cross" and "W-ANT", by margins of 15.6 points and 9.4 points, respectively. This demonstrates that our MLN model will further improve the extraction of cross-links and of intra-links with antecedents if we have a better coreference resolver.

We believe the reason for these results lies in two crucial differences between the SVM and MLN models:

• With Formula (SiD) in Table 4, the MLN joint model has more chances to extract "W-ANT" relations. It also affects the first term of Formula (T). By contrast, the SVM pipeline cannot easily model the notion of salience in discourse, so the effect of coreference is weak.

• Formula (T) of the MLN is defined as a soft constraint. Hence, other formulae may reject a cross-link suggested by Formula (T). The SVM pipeline deterministically identifies cross-links and is hence more prone to errors in the intra-sentence E-A extraction.

Finally, the potential for further improvement through the coreference-based approach is limited by the performance of intra-link extraction. Moreover, we also observe that 20% of cross-links are cases of zero-anaphora. Here the utility of coreference information is naturally limited, and our Formula (T) cannot come into effect due to missing corefer(j, k) atoms.

6 Conclusion and Future Work

In this paper we presented a novel approach to event extraction with coreference relations. Our approach incorporates coreference relations through the two concepts of salience in discourse and transitivity. The coreferent arguments we focus on are generally valuable for document understanding in terms of discourse structure, and they should be aggressively extracted. We proposed two models, an SVM pipeline and an MLN joint model, and both improved the attachment of intra-sentence and cross-sentence arguments related to coreference relations. Furthermore, we confirmed that improvements in coreference resolution lead to higher performance in event-argument relation extraction.

However, the potential for further improvement through the coreference-based approach is limited by the performance on intra-sentence links and by zero-anaphora cases. To overcome this problem, we plan to propose a collective approach over a whole document. Specifically, we are constructing a joint model of coreference resolution and event extraction that considers all tokens in a document, based on the idea of Narrative Schemas (Chambers and Jurafsky, 2009). If we take all tokens in a document into account at one time, we can consider various relations between events (event chains) through anaphoric chains. But to implement such a joint model in Markov Logic, we cannot escape fighting against time and space complexity. So, we are investigating a reasonable approximation for learning and inference in joint approaches.


References

Jari Björne, Juho Heimonen, Filip Ginter, Antti Airola, Tapio Pahikkala, and Tapio Salakoski. 2009. Extracting complex biological events with rich graph-based feature sets. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 10–18, Boulder, CO, USA. Association for Computational Linguistics.

Ekaterina Buyko, Erik Faessler, Joachim Wermter, and Udo Hahn. 2009. Event extraction from trimmed dependency graphs. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 19–27, Boulder, CO, USA. Association for Computational Linguistics.

Nathanael Chambers and Dan Jurafsky. 2009. Unsupervised learning of narrative schemas and their participants. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 602–610, Suntec, Singapore, August. Association for Computational Linguistics.

Barbara J. Grosz, Scott Weinstein, and Aravind K. Joshi. 1995. Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21:203–225.

Jin-Dong Kim, Tomoko Ohta, and Jun'ichi Tsujii. 2008. Corpus annotation for mining biomedical events from literature. BMC Bioinformatics, 9(1):10+.

Jin-Dong Kim, Tomoko Ohta, Sampo Pyysalo, Yoshinobu Kano, and Jun'ichi Tsujii. 2009. Overview of BioNLP'09 shared task on event extraction. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 1–9, Boulder, CO, USA. Association for Computational Linguistics.

Hoifung Poon and Lucy Vanderwende. 2010. Joint inference for knowledge extraction from biomedical literature. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 813–821, Los Angeles, California, June. Association for Computational Linguistics.

Matthew Richardson and Pedro Domingos. 2006. Markov logic networks. Machine Learning, 62(1-2):107–136.

Sebastian Riedel, Hong-Woo Chun, Toshihisa Takagi, and Jun'ichi Tsujii. 2009. A Markov logic approach to bio-molecular event extraction. In BioNLP '09: Proceedings of the Workshop on BioNLP, pages 41–49, Boulder, CO, USA. Association for Computational Linguistics.

Wee Meng Soon, Hwee Tou Ng, and Daniel Chung Yong Lim. 2001. A machine learning approach to coreference resolution of noun phrases. Computational Linguistics, 27(4):521–544.

Jian Su, Xiaofeng Yang, Huaqing Hong, Yuka Tateisi, and Jun'ichi Tsujii. 2008. Coreference resolution in biomedical texts: a machine learning approach. In Michael Ashburner, Ulf Leser, and Dietrich Rebholz-Schuhmann, editors, Ontologies and Text Mining for Life Sciences: Current Status and Future Perspectives, number 08131 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany.

Hirotoshi Taira, Sanae Fujita, and Masaaki Nagata. 2008. A Japanese predicate argument structure analysis using decision lists. In EMNLP '08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 523–532, Honolulu, HI, USA. Association for Computational Linguistics.

Xiaofeng Yang, Guodong Zhou, Jian Su, and Chew Lim Tan. 2004. Improving noun phrase coreference resolution by matching strings. In Proceedings of the 1st International Joint Conference on Natural Language Processing, pages 326–333.


