<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Event Extraction System via Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alapan Kuila</string-name>
          <email>alapan.cse@iitkgp.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sudeshna Sarkar</string-name>
          <email>sudeshna@cse.iitkgp.ernet.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Kharagpur</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we describe the IIT KGP team's participation in the Event Extraction task at FIRE 2017. We developed an event extraction system which extracts event-phrases from tweets written in Indian-language scripts as well as in Roman script. We designed our system for Hindi and then applied the same system to Malayalam and Tamil. We submitted two systems: one uses a pipelined architecture, the other a non-pipelined architecture. In the pipelined architecture we first identify the tweets which contain an event and then extract the event-phrase from those tweets. In the non-pipelined system all tweets are passed directly to the event extraction module. Though conceptually simple, the non-pipelined approach gives better results than the pipelined approach, achieving F1-scores of 50.01, 48.29 and 51.80 on the Hindi, Malayalam and Tamil datasets respectively.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Event extraction from unstructured text is one of the most
important and challenging tasks in information extraction and natural
language processing. It deals with the automatic
extraction of events depicting accidents, crime, natural disasters, political
events, etc. from newswires, discussion forums, and social media
texts. Most existing event extraction systems [
        <xref ref-type="bibr" rid="ref14 ref2 ref8">2, 8, 14</xref>
        ] deal
with English text, where the main objective is to detect event trigger
words and to classify them into predefined event
classes [
        <xref ref-type="bibr" rid="ref11 ref14">11, 14</xref>
        ]. Though there exist several successful efforts for
English, such as the ACE and TAC (https://tac.nist.gov/2017/KBP/) evaluation tracks, there
is no comparable standard event extraction tool for Indian languages.
The event extraction task at FIRE 2017 aims to identify and extract
events from newswires and social media text, specifically tweets.
The tweets are written in three Indian language scripts: Hindi,
Malayalam and Tamil, along with romanized script. Unlike typical
event extraction systems [
        <xref ref-type="bibr" rid="ref14 ref8">8, 14</xref>
        ], where the objective is to detect
trigger words in sentences and classify them into predefined
event types, the FIRE 2017 shared task on event extraction deals
with the extraction of the event-phrase (which depicts an event) from
a given tweet. In this paper, we present the system we
developed for this event extraction task at FIRE 2017, which deals with
event extraction from newswires and social media text in Indian
languages.
      </p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>
        Many approaches have been taken to extract events from text.
Judea and Strube (2015) formulated event extraction
as frame-semantic parsing [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. McClosky et al. (2011) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] used
dependency parsing to extract events. Earlier, researchers used
feature-based approaches to extract events [
        <xref ref-type="bibr" rid="ref18 ref3 ref9">3, 9, 18</xref>
        ]. But such features
are domain dependent and require substantial linguistic knowledge [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. To overcome the difficulties of complicated feature engineering
and domain dependency, researchers have applied neural network
approaches to event classification [
        <xref ref-type="bibr" rid="ref2 ref11 ref14">2, 11, 14</xref>
        ]. However, these works deal with English text, and their principal
objective is to detect the trigger words that indicate an event; some
also identify the arguments of the event triggers and their
corresponding roles [
        <xref ref-type="bibr" rid="ref2 ref14 ref18">2, 14, 18</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2a">
      <title>TASK DESCRIPTION</title>
      <p>The event extraction task at FIRE 2017 requires participants to detect
the event-phrase in each given tweet. In the training set, tweets are
written in three Indian languages: Hindi, Malayalam and Tamil, along
with romanized script. The objective is to detect the phrase within
the tweet which depicts events such as natural disasters (floods,
earthquakes, etc.), man-made disasters (accidents, crime, etc.),
political events (inaugurations by political leaders, political rallies, etc.), and
cultural/social events (seminars, conferences, light music events, etc.).
      </p>
    </sec>
    <sec id="sec-3">
      <title>DATASETS</title>
      <p>The dataset contains tweets written in both Indian-language and
Roman scripts. The three Indian languages are Hindi, Malayalam and
Tamil. The training dataset contains two files for each language: one file
contains the tweets obtained using the Twitter API, and an
annotation file contains the event phrases extracted from the tweets
in the first file. Each line in the annotation file contains the tweet-id,
the user-id, the event phrase of the tweet, the index at which the phrase starts
in the tweet string, and the string length of the event phrase. The test file
contains only the tweets with their corresponding tweet-id and user-id. The
details of the training and test datasets are given in Table 1.</p>
      <sec id="sec-3-1">
        <title>Table 1: Language, training data, and number of events in the annotation file</title>
        <p>The annotation files contain 402, 674 and 1109 events.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>SYSTEM DESCRIPTION</title>
      <p>In this section we describe our event extraction system. We have
experimented with two types of event extraction systems: (1) a
non-pipelined approach and (2) a pipelined approach. We use
neural networks as the main technique in both cases.</p>
    </sec>
    <sec id="sec-5">
      <title>Preprocessing</title>
      <p>The training file contains tweets which are written in mainly Indian
language script with some Romanized script. Some of the tweets
are ending with urls. To avoid data sparseness problem we have
replaced all the urls with a unique token. Event annotation file
contains some event phrases which are taken from same tweets
and indicate same event and the words contained in those
eventphrases are more or less same. We have omitted those redundant
event-phrases.
5.2</p>
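      <p>The URL replacement step can be sketched as follows (a minimal illustration; the regex and the placeholder token name are our assumptions, not the exact code used):</p>

```python
import re

# Matches http/https URLs, which often appear at the end of tweets.
URL_RE = re.compile(r"https?://\S+")

def replace_urls(tweet, token="URLTOK"):
    # Replace every URL with one unique token to avoid data sparseness.
    return URL_RE.sub(token, tweet)

print(replace_urls("flood relief updates https://t.co/abc123"))
# flood relief updates URLTOK
```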
    </sec>
    <sec id="sec-6">
      <title>Run1: Non-pipelined approach</title>
      <p>
        In the non-pipelined approach we formulate event
extraction as a sequence labelling problem: every token
in the input tweet is tagged with '0' or '1', i.e. 'outside
event-phrase' or 'inside event-phrase' respectively. For this
task we use a combination of a convolutional neural network [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and a bidirectional LSTM [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. To prepare the input to
the convolution layer we fix the sequence length to
the maximum tweet length, padding shorter
tweets with a special token where necessary. An
embedding layer in the network transforms each token into
a real-valued vector [
        <xref ref-type="bibr" rid="ref13 ref17">13, 17</xref>
        ]. The sequence of real-valued
vectors is then fed to the neural network model, whose main
architecture is a convolutional neural
network (CNN) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] followed by a bidirectional LSTM [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The input
to the convolution layer is a matrix of size n ∗ m, where n is the
sequence length and m is the dimensionality of the word vectors.
The CNN passes this matrix through a convolution
layer with a fixed filter length and filter count. Then, without
any pooling layer, the output of the first
convolution layer is passed to a second convolution layer with another
fixed filter length and count, keeping the sequence length the same as the
input sequence length. The resulting internal representation has size
n ∗ mc, where mc is the dimension of the internal vector representation.
This representation is fed to a bidirectional LSTM
with one hidden layer. The output of the bidirectional LSTM layer is
followed by a softmax layer that computes the probability distribution
over the tags '0' and '1' for each token in the sequence.
      </p>
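      <p>The shape bookkeeping above can be illustrated with a toy numpy convolution (a sketch only: it shows how two stacked zero-padded convolutions keep the sequence length n while changing the feature dimension, using the filter sizes and counts from Section 5.5; it is not the authors' implementation):</p>

```python
import numpy as np

def conv1d_same(x, filters):
    # x: (n, m) matrix of n word vectors of dimension m.
    # filters: (k, m, mc) filter bank with filter length k.
    # Zero-padded so the output keeps sequence length n
    # (no pooling between the two convolution layers).
    k, m, mc = filters.shape
    pad = k // 2
    xp = np.vstack([np.zeros((pad, m)), x, np.zeros((k - 1 - pad, m))])
    out = np.zeros((x.shape[0], mc))
    for i in range(x.shape[0]):
        window = xp[i:i + k]  # (k, m) slice under the filter
        out[i] = np.tensordot(window, filters, axes=([0, 1], [0, 1]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((12, 100))                      # n=12 tokens, m=100
h1 = conv1d_same(x, rng.standard_normal((3, 100, 30)))  # filter size 3, 30 filters
h2 = conv1d_same(h1, rng.standard_normal((4, 30, 20)))  # filter size 4, 20 filters
print(h2.shape)  # (12, 20): n preserved, ready for the bidirectional LSTM
```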
      <p>Figure 1: word embeddings W1 to Wn are passed through two back-to-back convolution layers, and the resulting feature-vector representation is fed to forward and backward LSTMs.</p>
    </sec>
    <sec id="sec-6a">
      <title>Run2: Pipelined approach</title>
      <p>
We noticed that approximately 40% of the tweets in the
training corpus contain event phrases, so it is wasteful to run the
event extraction module on every tweet. Based on this intuition we have
placed an event classification module before the event extraction
module described in the non-pipelined approach. In the pipelined
approach, tweets are first classified as event-tweets or
non-event tweets. Tweets classified as event-tweets by the
classification module are fed to the event extraction module; the
tweets classified as non-event tweets are discarded
and are not fed to the event-extraction module. The classification
module is similar to [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], where the authors perform sentence
modelling and sentence classification using convolutional neural
networks.
      </p>
      <p>Figure 2: pipelined architecture: input tweets are preprocessed, the cleaned tweets are passed to tweet classification, and the tweets containing events are passed to the event extraction model (output: Run2).</p>
      <p>
5.3.1 Tweet Classification. Here we have used a convolution
neural network (CNN) based architecture for tweet classification. As
the tweets are of different lengths, padding is applied to make them
a fixed size. The padded sequences are fed to an embedding
layer that converts the tokens into fixed-size real-valued vectors.
The sequences of vectors are then fed to a convolution
layer followed by a max-pooling layer, and the internal representation
is again fed to a second convolution layer followed by a
pooling layer. The model uses multiple filter sizes to obtain multiple features.
The output is finally fed to a fully connected softmax layer which
gives the probability distribution over the two classes: event-tweet
or non-event tweet. The performance of the tweet-classification
module is reported in Table 2.</p>
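      <p>The multi-filter convolution and pooling can be sketched in numpy (a toy illustration of the idea from [5, 6]: max-pooling over time turns each filter bank's feature map into a fixed-size vector, and the vectors for filter sizes 3, 4 and 5 are concatenated; the shapes follow Section 5.5, but the code is ours):</p>

```python
import numpy as np

def conv_maxpool(x, filters):
    # x: (n, m) padded tweet; filters: (k, m, f) filter bank.
    # Convolve over time, then max-pool so the result has a
    # fixed size f regardless of the tweet length n.
    k, m, f = filters.shape
    n = x.shape[0]
    feats = np.stack([
        np.tensordot(x[i:i + k], filters, axes=([0, 1], [0, 1]))
        for i in range(n - k + 1)
    ])                            # (n - k + 1, f) feature map
    return feats.max(axis=0)      # (f,) pooled features

rng = np.random.default_rng(1)
x = rng.standard_normal((12, 100))  # 12 padded tokens, 100-dim embeddings
# filter sizes {3, 4, 5} with 20 filters each
pooled = [conv_maxpool(x, rng.standard_normal((k, 100, 20))) for k in (3, 4, 5)]
features = np.concatenate(pooled)   # concatenated contextual feature vector
print(features.shape)  # (60,): input to the fully connected softmax layer
```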
      <p>Figure 3: tweet classification architecture: the padded input tweet (tokens W1 to W5 plus padding tokens P) is passed through convolution and pooling layers with filter sizes 3, 4 and 5, and the resulting feature maps are concatenated into contextual feature vectors.</p>
      <p>Finally, the tweets classified as event-tweets are fed to the
event extraction module described in the non-pipelined section. The
architecture of the event extraction module in the pipelined approach is
the same as in the non-pipelined approach; the only difference is that in
the pipelined approach we train on only those
tweets which contain events, discarding event-free tweets
from the training data.</p>
      <p>The event extraction module then gives the event span (i.e. the event
phrase) within the tweet.</p>
    </sec>
    <sec id="sec-7">
      <title>5.4 Postprocessing</title>
      <p>The event phrase which depicts an event inside a tweet consists of a
consecutive word sequence. So, after sequence tagging, if there exist
'0's inside a sequence of '1's, the first '1' is taken as the starting point
of the event-phrase and the last '1' in the sequence indicates its
end. All tokens inside this boundary are considered
part of the event-phrase. We use this heuristic to maintain the constraint
that every event-phrase consists of consecutive tokens.</p>
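      <p>This heuristic can be sketched in a few lines (the function name is ours; the tags are the per-token 0/1 predictions described above):</p>

```python
def event_span(tags):
    # Return the (start, end) indices (inclusive) of the event-phrase:
    # from the first '1' to the last '1'; any '0's in between are
    # absorbed so the phrase stays a run of consecutive tokens.
    ones = [i for i, t in enumerate(tags) if t == 1]
    if not ones:
        return None  # no event-phrase predicted in this tweet
    return ones[0], ones[-1]

print(event_span([0, 1, 1, 0, 1, 0]))  # (1, 4): the interior 0 is absorbed
```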
    </sec>
    <sec id="sec-8">
      <title>5.5 Parameters and training</title>
      <p>The event extraction model used in the pipelined and non-pipelined
approaches uses the same architecture and hyperparameters. For
embeddings, we use 100 dimensions in
the word embedding layer. The first convolution layer uses a filter
size of 3 with mf = 30 filters; the second convolution
layer uses a filter size of 4 with mh = 20 filters. The
bidirectional LSTM layer uses one hidden layer of size
60. For event classification we use the CNN-based
approach with word embeddings of size 100. These vectors are
randomly initialized and fed to the embedding layer. We have
employed filter sizes of {3, 4, 5} with 20 filters per filter size for the
convolution operation. Finally, we train the neural network
models using the Adam optimizer with shuffled minibatches, a dropout
rate of 0.5, and backpropagation for gradient calculation and parameter
updates.</p>
    </sec>
    <sec id="sec-9">
      <title>6 RESULT AND ERROR ANALYSIS</title>
      <p>Table 3 shows the performance of event extraction in all three
languages using both the pipelined and non-pipelined approaches.
Examining the results for each language, we found that the
non-pipelined system gives a better F-score than the pipelined approach.
On the Hindi dataset the pipelined system achieves an F-score of 40.35,
while the non-pipelined approach reaches 50.01. For Malayalam the
F-scores of the pipelined and non-pipelined approaches are 47.17 and 48.29
respectively, which are comparable. On Tamil the non-pipelined
system, with an F-score of 51.80, beats the pipelined system (F-score:
44.01). Error propagation in the pipelined approach may be responsible
for its lower performance: the accuracy of the
tweet-classification module directly influences the event extraction
system in the pipelined approach. It is also evident from Table
3 that precision is quite low in both the pipelined and
non-pipelined systems. We will investigate our model further to improve the
precision score.</p>
    </sec>
    <sec id="sec-10">
      <title>7 CONCLUSION AND FUTURE WORK</title>
      <p>
        We have taken two strategies for event extraction. In the
non-pipelined approach we classify each word with tag '1' or '0',
indicating inside or outside the event-phrase. But since many
tweets do not describe any event, in the pipelined
approach we first detect the tweets which contain an event
and then identify the span of the event inside the tweet. The
accuracy of the pipelined approach depends on the accuracy of the tweet
classification module, so we will try to improve the performance
of the tweet-classification module. In our experiments the number of
training tweets was very low; with more training data
the event extraction accuracy may increase. In future we will try to
improve the performance of the event extraction system by using
more training data and other advanced strategies [
        <xref ref-type="bibr" rid="ref1 ref10">1, 10</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Yubo</given-names>
            <surname>Chen</surname>
          </string-name>
          , Shulin Liu, Xiang Zhang, Kang Liu, and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Automatically Labeled Data Generation for Large Scale Event Extraction</article-title>
          .
          <source>In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <string-name>
            <surname>ACL</surname>
          </string-name>
          <year>2017</year>
          , Vancouver, Canada,
          <source>July 30 - August 4</source>
          , Volume
          <volume>1</volume>
          :
          <string-name>
            <given-names>Long</given-names>
            <surname>Papers</surname>
          </string-name>
          .
          <fpage>409</fpage>
          -
          <lpage>419</lpage>
          . https://doi.org/10.18653/v1/
          <fpage>P17</fpage>
          -1038
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Yubo</given-names>
            <surname>Chen</surname>
          </string-name>
          , Liheng Xu, Kang Liu, Daojian Zeng,
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          , et al.
          <year>2015</year>
          .
          <article-title>Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Yu</given-names>
            <surname>Hong</surname>
          </string-name>
          , Jianfeng Zhang, Bin Ma, Jianmin Yao,
          Guodong Zhou, and
          <string-name>
            <given-names>Qiaoming</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Using cross-entity inference to improve event extraction</article-title>
          .
          <source>In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics</source>
          ,
          <fpage>1127</fpage>
          -
          <lpage>1136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Alex</given-names>
            <surname>Judea</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Strube</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Event Extraction as Frame-Semantic Parsing.</article-title>
          .
          <source>In * SEM@ NAACL-HLT</source>
          .
          <fpage>159</fpage>
          -
          <lpage>164</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Nal</given-names>
            <surname>Kalchbrenner</surname>
          </string-name>
          , Edward Grefenstette, and
          <string-name>
            <given-names>Phil</given-names>
            <surname>Blunsom</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A convolutional neural network for modelling sentences</article-title>
          .
          <source>arXiv preprint arXiv:1404.2188</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Yoon</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Convolutional Neural Networks for Sentence Classification</article-title>
          .
          <source>In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29</source>
          ,
          <year>2014</year>
          , Doha,
          <string-name>
            <surname>Qatar,</surname>
          </string-name>
          <article-title>A meeting of SIGDAT, a Special Interest Group of the ACL</article-title>
          .
          <volume>1746</volume>
          -
          <fpage>1751</fpage>
          . http://aclweb.org/anthology/D/ D14/D14-1181.pdf
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Alex</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          , Ilya Sutskever, and
          <string-name>
            <given-names>Geoffrey E</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>ImageNet Classification with Deep Convolutional Neural Networks</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          25,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pereira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J. C.</given-names>
            <surname>Burges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bottou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          (Eds.). Curran Associates, Inc.,
          <fpage>1097</fpage>
          -
          <lpage>1105</lpage>
          . http://papers.nips.cc/paper/ 4824-imagenet
          <article-title>-classification-with-deep-convolutional-neural-networks</article-title>
          .pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Qi</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Heng</given-names>
            <surname>Ji</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Liang</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Joint Event Extraction via Structured Prediction with Global Features.</article-title>
          .
          <source>In ACL (1)</source>
          .
          <fpage>73</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Shasha</given-names>
            <surname>Liao</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ralph</given-names>
            <surname>Grishman</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Using document level cross-event inference to improve event extraction</article-title>
          .
          <source>In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics</source>
          ,
          <fpage>789</fpage>
          -
          <lpage>797</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Shulin</given-names>
            <surname>Liu</surname>
          </string-name>
          , Yubo Chen, Shizhu He, Kang Liu, and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Leveraging FrameNet to Improve Automatic Event Detection</article-title>
          .
          <source>In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12</source>
          ,
          <year>2016</year>
          , Berlin, Germany, Volume
          <volume>1</volume>
          :
          <string-name>
            <given-names>Long</given-names>
            <surname>Papers</surname>
          </string-name>
          . http://aclweb.org/anthology/ P/P16/P16-1201.pdf
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Shulin</given-names>
            <surname>Liu</surname>
          </string-name>
          , Yubo Chen, Kang Liu, and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms</article-title>
          .
          <source>In Proceedings of the 55th Annual</source>
          <article-title>Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</article-title>
          .
          <source>Association for Computational Linguistics</source>
          ,
          <fpage>1789</fpage>
          -
          <lpage>1798</lpage>
          . https://doi.org/10.18653/v1/
          <fpage>P17</fpage>
          -1164
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>David</given-names>
            <surname>McClosky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mihai</given-names>
            <surname>Surdeanu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Event Extraction As Dependency Parsing</article-title>
          .
          <source>In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume</source>
          <volume>1</volume>
          (
          <issue>HLT</issue>
          '11).
          <article-title>Association for Computational Linguistics</article-title>
          , Stroudsburg, PA, USA,
          <fpage>1626</fpage>
          -
          <lpage>1635</lpage>
          . http://dl.acm.org/citation.cfm?id=
          <volume>2002472</volume>
          .
          <fpage>2002667</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Wen-tau Yih, and
          <string-name>
            <given-names>Geoffrey</given-names>
            <surname>Zweig</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Linguistic regularities in continuous space word representations</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Thien Huu</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , Kyunghyun Cho, and
          <string-name>
            <given-names>Ralph</given-names>
            <surname>Grishman</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Joint Event Extraction via Recurrent Neural Networks.</article-title>
          .
          <source>In HLT-NAACL</source>
          .
          <fpage>300</fpage>
          -
          <lpage>309</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Thien Huu</given-names>
            <surname>Nguyen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ralph</given-names>
            <surname>Grishman</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Event Detection and Domain Adaptation with Convolutional Neural Networks</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuster</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.K.</given-names>
            <surname>Paliwal</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Bidirectional Recurrent Neural Networks</article-title>
          .
          <source>Trans. Sig. Proc. 45</source>
          ,
          <issue>11</issue>
          (nov
          <year>1997</year>
          ),
          <fpage>2673</fpage>
          -
          <lpage>2681</lpage>
          . https://doi.org/10.1109/78.650093
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Joseph</given-names>
            <surname>Turian</surname>
          </string-name>
          , Lev Ratinov, and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Word representations: a simple and general method for semi-supervised learning</article-title>
          .
          <source>In Proceedings of the 48th annual</source>
          <article-title>meeting of the association for computational linguistics</article-title>
          .
          <source>Association for Computational Linguistics</source>
          ,
          <fpage>384</fpage>
          -
          <lpage>394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Bishan</given-names>
            <surname>Yang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tom M.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Joint Extraction of Events and Entities within a Document Context</article-title>
          .
          <source>CoRR abs/1609</source>
          .03632 (
          <year>2016</year>
          ). arXiv:
          <volume>1609</volume>
          .03632 http://arxiv.org/abs/1609.03632
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>