=Paper=
{{Paper
|id=None
|storemode=property
|title=Identification of negated regulation events in the literature: exploring the feature space
|pdfUrl=https://ceur-ws.org/Vol-714/ShortPaper07_Sarafraz.pdf
|volume=Vol-714
|dblpUrl=https://dblp.org/rec/conf/smbm/SarafrazN10
}}
==Identification of negated regulation events in the literature: exploring the feature space==
Farzaneh Sarafraz¹, Goran Nenadic¹,²
¹ School of Computer Science, University of Manchester, Manchester, UK
² Manchester Interdisciplinary BioCentre, University of Manchester, UK
Email addresses: FS: sarafraf@cs.man.ac.uk; GN: g.nenadic@manchester.ac.uk
Abstract
Background. Regulation events are of critical importance to researchers trying to understand
processes in living beings. These events are naturally complex and can involve both individual
molecular entities and other biomedical events. Of equal importance is the ability to capture
statements that refer to regulation events that do not take place. In this paper we explore the
identification of negated regulation events in the literature using a number of features.
Results. We construe the problem as a classification task and apply support vector machines
that use lexical, syntactic and semantic features associated with sentences that represent
events. Lexical features include negation cues, part-of-speech tagging and surface distances,
whereas syntactic features are engineered from constituency parse trees, the command
relation between constituents and parse-tree distances. Semantic features include event sub-
type and participant types. On a test dataset, the best precision was achieved by combining all
features, while ignoring surface-level distances resulted in the best recall. Overall, the best F-
measure was 54%.
Conclusions. Syntactic features proved useful for improving recall, whereas semantic
features proved useful for improving precision, demonstrating both the potential and the limits of
task-specific feature engineering for negation detection. Contrasting statements are frequently
used to express negated events, and many false negatives were due to such statements not
being captured.
Background
Several efforts have been recently initiated in the text mining community that focus on the
extraction of structured information about biomedical relations and events, including protein-
protein interactions, gene expression, etc. [1, 2]. These efforts aim to support both data
consolidation (population of curated databases) and knowledge exploration (e.g. hypothesis
generation) [3, 4].
A topic that has been of particular interest in biology and medicine is the investigation of gene
regulatory networks, which are of critical importance to researchers trying to understand
regulatory mechanisms in living beings. There have been a number of databases developed to
store knowledge about gene regulation in various model organisms (e.g. RegulonDB with
regulation information in E. coli [5]), but populating such databases proved to be challenging
given the pace of publications and complexity of the events. Regulatory events are particularly
complex as their participants can be either entities (e.g. a protein) or other events. Therefore,
regulation events can be recursively nested, and – given that regulations can be positive
(facilitating a particular process) or negative (inhibiting a particular event) – they typically require
complex linguistic expressions to report and explain regulation findings. In addition to affirmative
findings, a number of events are also reported as negated (e.g. However, NFATc.beta neither
bound to the kappa3 element (an NFAT-binding site) in the tumor necrosis factor-alpha promoter
nor activated the tumor necrosis factor-alpha promoter in cotransfection assays). Detection of
negated events is of particular importance, as it affects the quality and the semantics of the
extracted information. In recent years, several challenges and shared tasks have included the
extraction of negations (e.g. the BioNLP’09 Shared Task 3 [2]).
In this paper we explore various features that can be used for identification of negated regulation
events in a machine learning (ML) method. We examine features mainly engineered from a
sentence parse tree with associated lexical and semantic cues.
Methods
Our target events are regulatory processes and causal relations between different biomedical
entities and processes [2]. Each regulatory event expressed in text is identified by:
• regulation type – we consider three regulation sub-types: positive regulation and
negative regulation, in addition to regulation events where there is no indication if it is
positive or negative (e.g. Involvement of mitogen-activated protein kinase pathways in
interleukin-8 production by human monocytes);
• regulation theme – an entity or event that is regulated;
• regulation cause – a protein or event that causes regulation;
• event trigger – a token(s) that indicates presence of the event in the associated
sentence.
Identification of these components is a challenging text mining task and has been discussed
widely [2, 6, 7]. In order to focus on exploration of the complexity of negations, unaffected by
automatic named entity recognition, event trigger detection, participant identification, etc., we use
pre-identified events: given a sentence that describes an event, we assume that all event
features (listed above) have been identified. Then, we construe the negation detection problem
as a classification task: the aim is to classify the event as affirmative or negative. The method is
based on features engineered from an event-representing sentence as follows.
Lexical features are based on a list of negation cues¹ and part-of-speech (POS) tagging of the
associated sentence. We also consider the surface distance between the negation cue and
trigger, theme and cause. More precisely, the lexical features include:
1. Whether the sentence contains a negation cue from the cue list;
2. The negation cue itself (if present);
3. The POS tag of the negation cue;
4. The POS tag of the trigger;
5. The POS tag of the theme; if the theme is another event, the POS tag of the trigger of
that event is used;
6. The POS tag of the cause; if the cause is another event, the POS tag of the trigger of
that event is used;
7. Surface distance between the trigger and cue;
8. Surface distance between the theme and cue;
9. Surface distance between the cause and cue.
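As a rough illustration of features 1-9, the following sketch extracts the lexical features from a pre-tokenised, POS-tagged sentence. This is not the authors' code: the cue list is a small illustrative subset, and the token indices stand in for the pre-identified event annotations.

```python
# A minimal sketch of lexical feature extraction (features 1-9), assuming the
# sentence is already tokenised and POS-tagged; the cue list below is a small
# illustrative subset, not the full list used in the paper.
NEGATION_CUES = {"no", "not", "none", "neither", "nor",
                 "unchanged", "impaired", "little", "independent", "except"}

def lexical_features(tokens, pos_tags, trigger_idx, theme_idx, cause_idx):
    """Return a dict of lexical features for one pre-identified event."""
    cue_idx = next((i for i, t in enumerate(tokens)
                    if t.lower() in NEGATION_CUES), None)
    has_cue = cue_idx is not None

    def surface_dist(i):
        # distance in tokens between a participant and the cue; -1 if no cue
        return abs(i - cue_idx) if has_cue else -1

    return {
        "has_cue": has_cue,                                 # feature 1
        "cue": tokens[cue_idx].lower() if has_cue else "",  # feature 2
        "cue_pos": pos_tags[cue_idx] if has_cue else "",    # feature 3
        "trigger_pos": pos_tags[trigger_idx],               # feature 4
        "theme_pos": pos_tags[theme_idx],                   # feature 5
        "cause_pos": pos_tags[cause_idx],                   # feature 6
        "trigger_cue_dist": surface_dist(trigger_idx),      # feature 7
        "theme_cue_dist": surface_dist(theme_idx),          # feature 8
        "cause_cue_dist": surface_dist(cause_idx),          # feature 9
    }
```

For an invented sentence such as "IL-2 did not activate STAT5" (trigger *activate*, theme *STAT5*, cause *IL-2*), the extractor reports the cue *not* with a trigger-cue surface distance of 1.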
Syntactic features are based on the results of constituency parsing of the associated sentence
and the command relation. The concept of command was introduced by Langacker to
determine the scope within a sentence affected by an element [8]. If a and b are nodes in the
constituency parse tree of a sentence, then a X-commands b iff the lowest ancestor of a with
label X is also an ancestor of b. Langacker observed that when a S-commands b, then a affects
the scope containing b. We hypothesise that if a negation cue X-commands an event trigger or
participant, then the associated event is negated. We explored various types of X-command,
including S-command (for sentence or sub-clause), NP-command (noun phrase), VP-command
(verb phrase), PP-command (prepositional phrase), etc. We also consider the distance within the
tree. Specifically, the syntactic features include:
¹ Negation cues include generic expressions (e.g. no, not, none) as well as 18 task-specific
words (e.g. unchanged, impaired, little, independent, except, exception).
10. The type of the lowest common ancestor of the trigger and the cue (S, VP, NP, JJ, or
PP);
11. Whether or not the negation cue X-commands the trigger (X is S, VP, NP, JJ, PP);
12. Whether or not the negation cue X-commands the theme (X is S, VP, NP, JJ, PP);
13. Whether or not the negation cue X-commands the cause (X is S, VP, NP, JJ, PP);
14. The parse-tree distance between the event trigger and the negation cue;
15. The parse-tree distance between the theme and the negation cue;
16. The parse-tree distance between the cause and the negation cue.
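The X-command test and the parse-tree distance (features 11-16) can be sketched on a toy constituency tree as follows; the tiny Node class is illustrative, not a real parser API.

```python
# Minimal sketch of the X-command relation and parse-tree distance on a
# constituency tree. A node a X-commands b iff the lowest ancestor of a
# labelled X is also an ancestor of b (Langacker's definition as used here).
class Node:
    def __init__(self, label, parent=None):
        self.label, self.parent = label, parent

def ancestors(node):
    """All ancestors of a node, nearest first."""
    out = []
    while node.parent is not None:
        node = node.parent
        out.append(node)
    return out

def x_commands(a, b, x):
    """True iff a X-commands b (features 11-13)."""
    lowest_x = next((n for n in ancestors(a) if n.label == x), None)
    return lowest_x is not None and lowest_x in ancestors(b)

def tree_distance(a, b):
    """Number of edges on the path from a to b (features 14-16)."""
    anc_a = [a] + ancestors(a)
    anc_b = [b] + ancestors(b)
    common = next(n for n in anc_a if n in anc_b)  # lowest common ancestor
    return anc_a.index(common) + anc_b.index(common)
```

For a tree S -> (VP -> cue, trigger; NP -> theme), the cue VP-commands the trigger but not the theme, while it S-commands both; this is what makes S- and VP-command the most informative variants.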
Semantic features introduce known characteristics of the regulation participants and the sub-
type of regulation (if known):
17. Regulation sub-type (positive, negative, none);
18. Theme type, which can be either a protein or one of the nine event types as defined in
BioNLP’09: gene expression, transcription, protein catabolism, localization,
phosphorylation, binding, regulation, positive regulation, and negative regulation;
19. Cause type is defined analogously to the theme type.
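The semantic features (17-19) are categorical and would typically be one-hot encoded before being passed to an SVM. The category lists below come from the text; the encoding scheme itself is an assumption, not the authors' implementation.

```python
# Sketch of one-hot encoding for the semantic features (17-19). The category
# lists are taken from the paper; the vector layout is illustrative.
EVENT_TYPES = ["gene_expression", "transcription", "protein_catabolism",
               "localization", "phosphorylation", "binding", "regulation",
               "positive_regulation", "negative_regulation"]
PARTICIPANT_TYPES = ["protein"] + EVENT_TYPES   # a participant is a protein or an event
REG_SUBTYPES = ["positive", "negative", "none"]

def one_hot(value, categories):
    return [1 if value == c else 0 for c in categories]

def semantic_features(subtype, theme_type, cause_type):
    """Concatenate the three one-hot encodings into one feature vector."""
    return (one_hot(subtype, REG_SUBTYPES)            # feature 17
            + one_hot(theme_type, PARTICIPANT_TYPES)  # feature 18
            + one_hot(cause_type, PARTICIPANT_TYPES)) # feature 19
```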
The above features have been used to train a series of binary SVM (support vector machine)
classifiers that aim to identify negated regulation events. The standard metrics (precision, recall
and F1-measure) were used to evaluate the results, where a true positive represents a correctly
identified negated event; a false negative is a negated event reported incorrectly as affirmative.
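The evaluation described above reduces to standard precision/recall/F1 over the negated class; a straightforward re-implementation (not the authors' code) looks like this:

```python
# Precision, recall and F1 for negation detection. Labels: 1 = negated,
# 0 = affirmative. A true positive is a correctly identified negated event;
# a false negative is a negated event classified as affirmative.
def prf1(gold, predicted):
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Note that with only 66 negated events out of 987 in the test set, the class distribution is heavily skewed, which is why accuracy is not reported and F1 over the negated class is used instead.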
Results
The data used in this study was provided by the BioNLP’09 challenge [2]. The training set
contained a total of 4,870 regulation events, 440 of which were reported as negated; the test set
contained 987 regulation events, of which 66 were negated. The associated sentences are
annotated with event types (the nine types specified above), textual triggers and participants. In
addition, every event has been tagged as either affirmative (reporting a specific interaction) or
negative (reporting that a specific interaction has not been observed). The training data was
used for modelling, and all results refer to the methods applied to the development dataset using
10-fold cross-validation. Constituency parse trees were produced using the method described
in [9].
Impact of lexical features. The results of using lexical features only are presented in Table 1.
As expected, surface distances to the negation cue are not good indicators, and do not improve
the performance of standard lexical and POS features – on the contrary, they reduce precision.
Overall, precision is relatively high but recall is low.
Lexical features                      Precision   Recall   F1
Features 1-6 (no surface distances)   75.00       22.73    34.88
All lexical features                  71.43       22.73    34.48
Table 1. The results of using lexical features only.
Impact of syntactic features. The results of using syntactic features only are presented in Table
2. Unlike surface distances, parse-tree distances are suitable features, improving the overall
performance significantly (F1 improving from 11% to 36%). There were no significant
differences in performance when different types of X-command relations were used (data not
shown); focusing only on S- and VP-command provides the same level of accuracy.
Syntactic features                         Precision   Recall   F1
Features 10-13 (no parse-tree distances)   80.00       6.06     11.27
All syntactic features                     60.71       25.76    36.17
Table 2. The results of using syntactic features only.
Impact of semantic features. Although there are no significant differences in the results when
lexical or syntactic features are used, semantic features on their own resulted in very low
performance, missing virtually all negated regulatory events (data not shown).
Combining features. Table 3 shows the results when features of various types are combined.
Combining several feature types (lexical, syntactic and semantic) proved to be beneficial. Surface
distances still reduce precision, but improve overall recall. It is interesting that adding
semantic features (which characterise the participants involved in the regulation) significantly
improves precision (by 20% when compared to the lexical and syntactic feature sets). On the
other hand, command relations improve recall (by almost 20%).
Features                                    Precision   Recall   F1
Lexical + syntactic                         66.67       39.39    49.52
Lexical + semantic                          50.00       15.15    23.26
Syntactic + semantic                        72.22       19.70    30.95
All with no surface distances               73.68       42.42    53.85
All with no X-command on theme and cause    78.12       37.88    51.02
All features                                78.79       39.39    52.53
Table 3. The results of combining different features.
We note that some feature subsets (e.g. features 10-13, Table 2) do not provide a balance
between precision and recall; depending on the application, the classification threshold could be
adjusted to produce higher recall or precision.
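Threshold adjustment of this kind can be sketched directly on the SVM's raw margin scores: lowering the decision threshold below zero trades precision for recall. The scores and threshold below are illustrative values, not outputs of the authors' classifier.

```python
# Sketch of trading precision against recall by moving the decision threshold
# applied to raw SVM margin scores (1 = negated, 0 = affirmative).
def classify(scores, threshold=0.0):
    """Label an event as negated when its margin exceeds the threshold."""
    return [1 if s > threshold else 0 for s in scores]

scores = [1.2, 0.4, -0.1, -0.8]        # illustrative margin scores
default = classify(scores)             # threshold 0.0  -> [1, 1, 0, 0]
high_recall = classify(scores, -0.5)   # lower threshold -> [1, 1, 1, 0]
```

An application that feeds candidate negations to a human curator would lower the threshold for recall; a fully automatic database population pipeline would raise it for precision.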
Discussion
Several approaches have recently been suggested for the extraction of negated biomedical
events. Kilicoglu and Bergler [6] and Hakenberg et al. [10] used a number of heuristic rules
concerning the type of the negation cue and the type of the dependency relation. The former
was the best-performing negation detection approach in the BioNLP’09 shared task, reporting
recall of up to 15% with an overall event detection sensitivity of 33%, on a ‘test’ dataset different
from the one used in this study that also included simpler, non-regulatory events [6]. MacKinlay and
colleagues also used ML, assigning a vector of complex deep parse features to every event trigger
[7]. They achieved an F-score of 36% on a dataset containing both nested regulatory events and
simpler events. Morante and Daelemans used machine learning to detect the negation scope in
biomedical text, but have not separately addressed what could negate a biomedical event [11].
We have explored not only negation of triggers but also phrases in which regulation theme and
cause have been negated (consider, for example, “SLP-76” in sentence “In contrast, Grb2 can be
coimmunoprecipitated with Sos1 and Sos2 but not with SLP-76”). These have resulted in a slight
improvement of both precision and recall.
Closer analysis of the errors shows that one recurring pattern, the contrasting pattern,
contributed a large portion of the false negatives. Authors often express contrasting
observations by describing one event and implying that the other is its opposite. For example,
consider this sentence:
Unlike TNFR1, LMP1 can interact directly with receptor-interacting protein (RIP) and
stably associates with RIP in EBV-transformed lymphoblastoid cell lines.
In this example, a negated interaction is expressed, but there is no sign of a negation cue or
negative sentence structure. Still, we can infer that TNFR1 cannot interact directly with RIP; it
may also imply that TNFR1 does not stably associate with RIP in certain cell lines. The negation
therefore can only be inferred by taking the following steps:
1. recognising the presence of a contrasting pattern in the sentence;
2. identifying the contrasting entities (in this example TNFR1 and LMP1);
3. extracting the explicitly stated event (LMP1 interacts with RIP in this case);
4. identifying the scope of contrast; this can be ambiguous, as in the above example it is not
clear whether the two entities also contrast in "stably associates with RIP", or only in
"interact directly with RIP".
Contrasting patterns are not uncommon. There are 125 phrases expressing contrast in the
training data (in 800 abstracts) and 32 in the development data (150 abstracts) using only the
patterns "unlike A, B", "B, unlike A", and "A; in contrast B". In these cases, the negation is usually
not linguistically explicit, and has to be inferred by analysing the contrasts. Future work will
explore a rule-based framework to identify contrasting patterns and entities, and treat
such expressions separately from explicit negations, for which an ML approach could still be
useful. We will also further explore the feature space by considering attributes and relations
extracted from the constituency parse to provide a more comprehensive classification model.
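Step 1 of the inference above (recognising the contrast and its entities) could be approximated with shallow regular expressions over the three surface patterns mentioned in the text. The entity capture here is a crude placeholder for proper named entity recognition, so this is a sketch of the idea rather than a working extractor.

```python
import re

# Shallow detection of the three contrast patterns "unlike A, B",
# "B, unlike A" and "A; in contrast B". \w[\w-]* is a crude stand-in
# for a recognised entity mention (it happens to cover names like SLP-76).
CONTRAST_PATTERNS = [
    re.compile(r"^Unlike\s+(?P<A>\w[\w-]*),\s+(?P<B>\w[\w-]*)", re.I),
    re.compile(r"(?P<B>\w[\w-]*),\s+unlike\s+(?P<A>\w[\w-]*)", re.I),
    re.compile(r"(?P<A>\w[\w-]*);\s+in\s+contrast,?\s+(?P<B>\w[\w-]*)", re.I),
]

def find_contrast(sentence):
    """Return the contrasting pair (A, B) if a pattern matches, else None."""
    for pat in CONTRAST_PATTERNS:
        m = pat.search(sentence)
        if m:
            return m.group("A"), m.group("B")
    return None
```

On the example from the text, this would pair TNFR1 (the contrasted entity A) with LMP1 (the entity of the explicitly stated event B); steps 2-4, including scope resolution, remain the hard part.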
Conclusions
Given the number of published articles, detection of negations is of particular importance for data
consolidation and mining. Here we explored the identification of negated regulation events, given
their triggers, themes and causes. A machine learning method that combines a set of lexical,
syntactic and semantic features engineered from the associated sentence was used. Adding
semantic features (which characterise the participants involved in the regulation) improved
precision by 20%; similarly, adding syntactic relations improved recall by almost 20%. A number of
false negatives originated from contrastive patterns that have been used to express both
affirmative and negated statements in parallel. The results suggested that ML approaches could
not learn from such examples, and that more complex syntactic or lexical patterns are needed to
capture this kind of negation.
References
[1] Krallinger M, Leitner F, Rodriguez-Penagos C, Valencia A. Overview of the protein-protein
interaction annotation extraction task of BioCreative II. Genome Biol. 2008; 9(Suppl 2): S4.
[2] Kim JD, Ohta T, Pyysalo S, Kano Y, Tsujii J: Overview of BioNLP’09 shared task on event
extraction. BioNLP’09: Proceedings of the Workshop on BioNLP. 1-9.
[3] Donaldson I, Martin J, de Bruijn B, Wolting C, Lay V, Tuekam B, Zhang S, Baskin B, Bader
GD, Michalickova K, Pawson T, Hogue CW. PreBIND and Textomy--mining the biomedical
litera-ture for protein-protein interactions using a support vector machine. BMC
Bioinformatics 4: 11.
[4] Natarajan J, Berrar D, Dubitzky W, Hack C, Zhang Y, DeSesa C, Van Brocklyn JR, Bremer
EG. Text mining of full-text journal articles combined with gene expression analysis
reveals a relationship between sphingosine-1-phosphate and invasiveness of a
glioblastoma cell line. BMC Bioinformatics. 7: 373.
[5] Gama-Castro S, Jiménez-Jacinto V, Peralta-Gil M, Santos-Zavaleta A, Peñaloza-Spinola MI,
Contreras-Moreira B, Segura-Salazar J, Muñiz-Rascado L, Martínez-Flores I, Salgado H,
Bonavides-Martínez C, Abreu-Goodger C, Rodríguez-Penagos C, Miranda-Ríos J, Morett E,
Merino E, Huerta AM, Treviño-Quintanilla L, Collado-Vides J. RegulonDB (version 6.0): gene
regulation model of Escherichia coli K-12 beyond transcription, active (experimental)
annotated promoters and Textpresso navigation, Nucleic Acids Res. 2008 (Database
issue):D120-4.
[6] Kilicoglu H, Bergler S. Syntactic dependency based heuristics for biological event
extraction. BioNLP’09: Proceedings of the Workshop on BioNLP. 119-127
[7] MacKinlay A, Martinez D, Baldwin T. Biomedical Event Annotation with CRFs
and Precision Grammars. BioNLP’09: Proceedings of the Workshop on BioNLP. 77-85.
[8] Langacker R. On Pronominalization and the Chain of Command. In D. Reibel and S.
Schane (eds.), Modern Studies in English, Prentice-Hall, Englewood Cliffs, NJ. 160–186, 1969.
[9] McClosky D, Charniak E, Johnson M. Effective Self-Training for Parsing. Proceedings of
HLT/NAACL 2006. 152-159.
[10] Hakenberg J, Solt I, Tikk D, Tari L, Rheinländer A, Nguyen QL, Gonzalez G, Leser U.
Molecular event extraction from link grammar parse trees. BioNLP’09: Proceedings of the
Workshop on BioNLP. 86-94.
[11] Morante R, Daelemans W. A Metalearning Approach to Processing the Scope of
Negation. CoNLL ‘09: Proceedings of the 13th Conference on Computational Natural Language
Learning. 21-29.