<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Survey of Textual Event Extraction from Social Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohamed MEJRI</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jalel AKAICHI</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Over the last decade, mining textual content on social networks to extract relevant data and useful knowledge has become an omnipresent task. One common application of text mining is Event Extraction, which is considered a complex task divided into multiple sub-tasks of varying difficulty. In this paper, we present a survey of the main existing text mining techniques used for different event extraction aims. First, we present the main data-driven approaches, which rely on statistical models to convert data into knowledge. Second, we present the knowledge-driven approaches, which rely on expert knowledge to extract knowledge, usually by means of pattern-based approaches. Then we present the main existing hybrid approaches, which combine data-driven and knowledge-driven approaches. We end this paper with a comparative study that recapitulates the main features of each presented method. Keywords: Event Extraction, Text Mining, Information Extraction, Social Network.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Social networks are defined as web-based systems (dedicated websites or other
applications) that allow users (individuals) to create public or semi-public profiles and
communicate with each other over the internet by posting information,
comments, messages, videos, etc. [
        <xref ref-type="bibr" rid="ref4 ref8">4, 8</xref>
        ].
      </p>
      <p>
        In recent years, social networks have become omnipresent because of the increasing
spread and affordability of internet-enabled devices such as personal
computers, smartphones, tablets and many other devices that allow users to connect
to social networks through internet services [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These new possibilities allow
people anywhere, at any time, to add, update, share and consult massive
quantities of new information in real time. These huge quantities of new
information, added by hundreds of millions of active users [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], are considered a very
important source of data for many research fields.
      </p>
      <p>
        These massive quantities of data raise three computational issues:
size, noise and dynamism [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. These issues make manual analysis of social network
data practically impossible. To remedy this problem, data mining provides a wide
range of techniques for detecting useful knowledge in massive datasets. Most
social network data is initially unstructured and usually expressed in human
natural language, which makes the understanding and interpretation of social
network content by machines a difficult task [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This problem impedes the automation
of Text Mining (TM) sub-tasks such as Information Retrieval [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and Information
Extraction (IE) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], which are frequently used in decision making.
In general, we can define Text Mining (TM) as the analysis of data contained in
natural language text. TM works by transposing words and phrases in
unstructured data into numerical values which can then be linked with structured data in
a database and analyzed with traditional data mining techniques [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ]. By means of
text mining, often using Natural Language Processing (NLP) techniques,
information is extracted from texts of various sources, such as news messages and blogs, and
is represented and stored in a structured way (generally in databases). A specific
type of knowledge that can be extracted from text by means of TM is an event,
which can be represented as a complex combination of relations linked to a set of
empirical observations from texts [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Event Extraction (EE) from the textual content
of social networks has gained remarkable attention in the last few years. For
example, the representation &lt;person&gt; &lt;attack&gt; &lt;person&gt; describes an attack event.
Words identified in text as referring to persons are linked to the concept &lt;person&gt;;
verbs having the meaning of attack are associated with &lt;attack&gt;. Thus, the same
event representation can be detected in texts such as "John shot his friend" or "A
woman was attacked by a stranger".
      </p>
      <p>
        Saval et al. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] proposed a semantic extension for modeling events of the type
"natural disaster". They define an event E as the combination of three components:
a semantic property S, a time interval I, and a spatial entity SP. Thus, an event
is represented as follows: E = &lt;I, SP, S&gt;. In their work of 2014, Serrano et al.
[
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] adapted this event representation by enriching it with an additional component
A corresponding to the different participants involved in the event. Thus, the
representation becomes &lt;I, SP, S, A&gt;, where A is a set of participants
playing one or more roles. A participant is denoted P<sub>i</sub>, where 0 &lt; i &lt; n, and
a role is denoted r<sub>j</sub>, where 0 &lt; j &lt; k. Component A is then defined as follows:
A = {(P<sub>i</sub>, r<sub>j</sub>)}, meaning that participant P<sub>i</sub> plays the role r<sub>j</sub> in the event concerned.
Event extraction from unstructured textual content could be useful for IE systems
in various ways. In fact, being able to detect and retrieve events could enhance
the quality and performance of personalized systems [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Therefore, using
events extracted from the textual content of social networks to deal with several issues
is becoming unavoidable. However, extracting events is a very difficult task,
divided into many sub-tasks of differing complexity, and it requires combining
many techniques and methods depending on the task at hand.
      </p>
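      <p>As a minimal sketch of the representation described above, the tuple &lt;I, SP, S, A&gt; can be written as a small Python data structure in which A is a set of (participant, role) pairs. The class name, the field names and the sample flood event are hypothetical illustrations and are not taken from the cited models.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Event:
    """Event as the tuple <I, SP, S, A> (all names here are illustrative)."""
    interval: Tuple[datetime, datetime]      # I: time interval (start, end)
    spatial_entity: str                      # SP: where the event takes place
    semantic_property: str                   # S: what kind of event it is
    participants: FrozenSet[Tuple[str, str]] = frozenset()  # A: (participant, role) pairs

    def roles_of(self, participant: str) -> List[str]:
        """Return the roles played by one participant, sorted alphabetically."""
        return sorted(role for name, role in self.participants if name == participant)


# Invented sample data: one participant may play several roles.
flood = Event(
    interval=(datetime(2014, 1, 1), datetime(2014, 1, 3)),
    spatial_entity="Riverside",
    semantic_property="flood",
    participants=frozenset({
        ("Red Cross", "rescuer"),
        ("Red Cross", "donor"),
        ("residents", "victim"),
    }),
)

print(flood.roles_of("Red Cross"))
```
]]></preformat>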
      <p>In this paper, we present a survey of the main existing approaches in the literature for
EE. In the first section, we present the data-driven event extraction approaches,
which rely on statistical methods to convert data into knowledge.
Then, we present the main knowledge-driven approaches, which extract knowledge
through the representation and exploitation of expert knowledge, usually by means of
pattern-based approaches. The last part of the first section is devoted to the
presentation of different hybrid methods based on the combination of data-driven
and knowledge-driven approaches. In section 3, we present a quick overview of the
main multilingual event extraction systems used in the recent literature. In the
third section, we discuss the main existing works that combine event extraction and
risk management. We end this paper with a comparative study in which we
outline the main differences, advantages and disadvantages of each approach.</p>
    </sec>
    <sec id="sec-2">
      <title>Event extraction from textual content</title>
      <p>In the available annotated corpora geared toward information extraction, we see
two models of events, emphasizing different aspects. On the one hand, there
is the TimeML model, in which an event is a word that points to a node in a
network of temporal relations. On the other hand, there is the ACE model, in which
an event is a complex structure, relating arguments that are themselves complex
structures, but with only ancillary temporal information (in the form of temporal
arguments, which are only noted when explicitly given). In the TimeML model,
every event is annotated, because every event takes part in the temporal network.
In the ACE model, only interesting events (events that fall into one of 34
predefined categories) are annotated. The task of automatically extracting ACE events is
more complex than extracting TimeML events (in line with the increased
complexity of ACE events), involving detection of event anchors, assignment of an array of
attributes, identification of arguments and assignment of roles, and determination
of event coreference.</p>
      <p>
        <bold>Events in the ACE program.</bold>
The ACE program provides annotated data, evaluation tools, and periodic
evaluation exercises for a variety of information extraction tasks. There are five basic
kinds of extraction targets supported by ACE: entities, times, values, relations, and
events. The ACE tasks for 2005 are more fully described in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>ACE entities fall into seven types (person, organization, location, geo-political
entity, facility, vehicle, weapon), each with a number of subtypes. Within the ACE
program, a distinction is made between entities and entity mentions (similarly
between event and event mentions, and so on). An entity mention is a referring
expression in text (a name, pronoun, or other noun phrase) that refers to
something of an appropriate type. An entity, then, is either the actual referent, in the
world, of an entity mention or the cluster of entity mentions in a text that refer to
the same actual entity. The ACE Entity Detection and Recognition task requires
both the identification of expressions in text that refer to entities (i.e., entity
mentions) and coreference resolution to determine which entity mentions refer to the
same entities.</p>
      <p>ACE events, like ACE entities, are restricted to a range of types. Thus, not all
events in a text are annotated, only those of an appropriate type. The eight
event types (with subtypes in parentheses) are Life (Be-Born, Marry, Divorce,
Injure, Die), Movement (Transport), Transaction (Transfer-Ownership,
Transfer-Money), Business (Start-Org, Merge-Org, Declare-Bankruptcy, End-Org), Conflict
(Attack, Demonstrate), Contact (Meet, Phone-Write), Personnel (Start-Position,
End-Position, Nominate, Elect), and Justice (Arrest-Jail, Release-Parole, Trial-Hearing,
Charge-Indict, Sue, Convict, Sentence, Fine, Execute, Extradite, Acquit, Appeal,
Pardon). Since there is nothing inherent in the task that requires the two levels of
type and subtype, for the remainder of the paper, we will refer to the combination
of event type and subtype (e.g., Life:Die) as the event type. In addition to their
type, events have four other attributes (possible values in parentheses): modality
(Asserted, Other), polarity (Positive, Negative), genericity (Specific, Generic), and tense
(Past, Present, Future, Unspecified).</p>
      <p>
        The most distinctive characteristic of events (unlike entities, times, and values, but
like relations) is that they have arguments. Each event type has a set of possible
argument roles, which may be filled by entities, values, or times. In all, there are 35
role types, although no single event can have all 35 roles. A complete description
of which roles go with which event types can be found in the annotation guidelines
for ACE events [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ]. Events, like entities, are distinguished from their mentions in
text. An event mention is a span of text (an extent, usually a sentence) with a
distinguished anchor ("the word that most clearly expresses [an event’s] occurrence"
[
        <xref ref-type="bibr" rid="ref38">38</xref>
        ]) and zero or more arguments, which are entity mentions, timexes, or values in
the extent. An event is either an actual event, in the world, or a cluster of event
mentions that refer to the same actual event. Note that the arguments of an event
are the entities, times, and values corresponding to the entity mentions, timexes,
and values that are arguments of the event mentions that make up the event. The
official evaluation metric of the ACE program is ACE value, a cost-based metric
which associates a normalized, weighted cost to system errors and subtracts that
cost from a maximum score of 100%. For events, the associated costs are largely
determined by the costs of the arguments, so that errors in entity, timex, and value
recognition are multiplied in event ACE value. Since it is useful to evaluate the
performance of event detection and recognition independently of the recognition of
entities, times, and values, the ACE program includes diagnostic tasks, in which
partial ground truth information is provided. Of particular interest here is the
diagnostic task for event detection and recognition, in which ground truth entities,
values, and times are provided.
      </p>
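      <p>The distinction between event mentions and events can be sketched as follows. This is a minimal illustration under stated assumptions, not the ACE annotation format: the class names, the entity identifiers (E1, E2, T1), the role labels and the two sample sentences are hypothetical. An event is modeled as a cluster of mentions, and its arguments are the union of the arguments of its mentions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EventMention:
    """One span of text describing an event (illustrative structure only)."""
    extent: str                 # the text span, usually a sentence
    anchor: str                 # the word that most clearly expresses the event
    event_type: str             # type and subtype combined, e.g. "Conflict:Attack"
    arguments: List[Tuple[str, str]] = field(default_factory=list)  # (role, entity id)
    modality: str = "Asserted"
    polarity: str = "Positive"
    genericity: str = "Specific"
    tense: str = "Unspecified"


@dataclass
class Event:
    """A cluster of event mentions that refer to the same real-world event."""
    mentions: List[EventMention]

    def arguments(self) -> List[Tuple[str, str]]:
        """Union of the arguments of all mentions, without duplicates."""
        return sorted({arg for mention in self.mentions for arg in mention.arguments})


# Two invented mentions of the same attack; role labels are illustrative.
m1 = EventMention("John shot his friend.", "shot", "Conflict:Attack",
                  [("Attacker", "E1"), ("Target", "E2")], tense="Past")
m2 = EventMention("The attack happened on Monday.", "attack", "Conflict:Attack",
                  [("Time", "T1")], tense="Past")

attack = Event([m1, m2])
print(attack.arguments())
```
]]></preformat>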
      <p>
        According to ACE terminology, an event trigger is the word that determines the
event occurrence; an argument is an entity mention, a value or a temporal expression
that constitutes an event attribute; and an event mention is an extent of text with the
distinguished trigger, entity mentions and other argument types [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
As mentioned above, event extraction is a complex task divided into many
sub-tasks; therefore, many techniques for event extraction from textual content exist
in the literature. As will be shown in this paper, the choice of suitable techniques
depends on the final requirements of each extraction task. In this section, we present
a survey of the main methods and approaches used in recent literature: the
data-driven approaches, the knowledge-driven approaches and the hybrid approaches. We end
this section with a comparative study that recapitulates the main features, fields of
application, advantages and disadvantages of each approach.
      </p>
      <sec id="sec-2-1">
        <title>Data-driven approaches for event extraction</title>
        <p>In contrast to pattern-based approaches (which are presented in section 2.2),
data-driven approaches automatically build models for particular NLP tasks (i.e.,
automated language processing) with no human intervention. In other words, these
approaches try to discover statistical relations using only quantitative
methods such as probabilistic modeling, information theory, and linear algebra.
To develop models that approximate linguistic phenomena, data-driven
methods require large text corpora, which is why these techniques are often called
corpus-based. Examples of discovered facts are words or concepts that are
(statistically) associated with one another. In recent literature, many techniques associated
with data-driven approaches are used, such as word frequency counting, Term
Frequency - Inverse Document Frequency (TF-IDF), word sense disambiguation
(WSD), n-grams, and clustering.</p>
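        <p>To make one of these techniques concrete, the following sketch computes TF-IDF weights with the Python standard library only. It assumes one common variant (term frequency normalized by document length, inverse document frequency taken as log(N / df)); other weighting schemes exist, and the three sample documents are invented for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dictionary per document.

    Assumes whitespace tokenization and non-empty documents.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = ["storm hits the coast", "the market falls", "the storm passes"]
w = tf_idf(docs)

# Terms that occur in every document get weight 0; rarer terms weigh more.
for term, weight in sorted(w[0].items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.3f}")
```
]]></preformat>
        <p>In this sketch a word that appears in every document (here "the") receives a weight of zero, while a word confined to a single document receives the highest weight in that document.</p>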
        <p>
          One common task in data-driven approaches for event extraction from text is
Part-of-Speech (POS) tagging, which is the process of assigning a part of speech to
each word in a sentence. In their work of 2006, Guy et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] compared
four data-driven taggers (TnT, MBT, SVMTool and MXPOST).
The experiments, which applied these data-driven taggers to
a given dataset (the annotated Helsinki Corpus of Swahili), show that MXPOST
is the most accurate tagger for this dataset. In another set of experiments,
they further improved on the performance of the individual taggers by combining
them into a committee of taggers. The results showed that
combining several taggers may enhance the performance and accuracy of the system. In the
same field, and to deal with morphologically complex languages, Mark and Joel
[
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] extended a statistical tagger to handle fine-grained tagsets and improve over
the best Icelandic POS tagger. Additionally, they developed a case tagger for
non-local case and gender decisions. Delia et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] investigated different unsupervised
techniques for extracting and clustering complex events from news articles. As a
first step, they proposed two complementary event extraction algorithms, based on
identifying verbs and their arguments and on shortest paths between entities,
respectively. Next, they obtained more general representations of the event mentions by
annotating the event trigger and arguments with concepts from knowledge bases.
The generalized arguments were used as features for a clustering approach, thus
determining related events.
        </p>
        <p>
          In their work of 2014, Deyu et al. [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] proposed a simple Bayesian modeling
approach to event extraction from Twitter, called the Latent Event Model (LEM), to
extract structured representations of events from social media. The proposed
approach is fully unsupervised and does not require annotated data for training;
it only requires the identification of named entities, locations
and time expressions. The model can then automatically extract events
involving a named entity at a certain time and location, with event-related keywords,
based on the co-occurrence patterns of the event elements. Okamoto et al. [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]
presented a method for the detection of occasional or volatile local events using
topic extraction technologies. They developed a framework based on a two-level
hierarchical clustering method. Clustering techniques gave
acceptable results with good accuracy for event extraction. Liu et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] presented
a framework for simultaneous key entity extraction and significant event mining
from daily web news based on clustering, entity modeling and a weighted undirected
bipartite graph. In the same field, the authors of [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ] developed a real-time news
event extraction system based on automatic pattern learning from a small
annotated corpus. To guarantee that massive amounts of textual data can
be digested in real time, they developed ExPRESS (Extraction Pattern
Engine and Specification Suite), a highly efficient extraction pattern engine, which is
capable of matching thousands of patterns within seconds. In [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], Lei et al.
presented a framework for extracting and tracking topic-relevant events based on the SVM
algorithm.
        </p>
        <p>The use of data-driven approaches for event extraction offers one main advantage: there
is no need for expert knowledge or linguistic resources. However, data-driven
approaches require large text corpora in order to develop models that approximate
linguistic phenomena. Another drawback is that data-driven methods do not deal
with the meaning of text. To remedy this problem, researchers resort to
knowledge-driven approaches, which are based on patterns that express rules representing expert
knowledge.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Knowledge-driven approaches for event extraction</title>
        <p>
          Also known as rule-based methods, knowledge-driven methods are commonly based
on patterns constructed by linguists. Patterns consist of lexically specified syntactic
templates that are matched to text, in much the same way as regular expressions,
and are applied along with type constraints on substrings of the match. These
patterns are lexically indexed local grammar fragments, annotated with semantic
relations between the various arguments and the knowledge representation [
          <xref ref-type="bibr" rid="ref39">39</xref>
          ].
These rules or patterns rely on linguistic knowledge about the structure of
language and are written in a formal notation so that they can be used by the computer for
further parsing [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. The design of patterns (which may be lexico-syntactic or
lexico-semantic) and the choice of appropriate techniques generally depend
on many factors, such as the language of the text to be processed and the
final purpose of the processing. Lexico-syntactic patterns combine
lexical and syntactical information [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], while lexico-semantic patterns
add semantic information, generally through the use of
gazetteers [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] or ontologies [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Lexico-syntactic patterns</title>
        <p>
          As mentioned before, lexico-syntactic patterns combine
lexical representations (i.e., strings) with syntactical information (e.g., Part-of-Speech).
For further clarification, we present the following lexico-syntactic pattern given by
Hearst in her work of 1998 [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]:
        </p>
        <p>such NP as {NP,}* {(or | and)} NP</p>
        <p>Here she aimed to find hyponym and hypernym relations by discovering
regular expression patterns in free text. In this pattern, NP indicates a noun phrase.
Other text (i.e., "such", "as", "or", and "and") is used for lexical matching, while parentheses
contain conjunction and disjunction statements to be evaluated, in this case
a disjunction (denoted as |). Also, * is a repetition parameter that indicates that the
sequence between braces ({ and }) is allowed to repeat zero to an infinite number
of times. Applying this lexico-syntactic pattern to the sentence "... works by such
authors as Herrick, Goldsmith, and Shakespeare" gives the following results:
hyponym("author", "Herrick")
hyponym("author", "Goldsmith")
hyponym("author", "Shakespeare")</p>
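        <p>The pattern above can be approximated with an ordinary regular expression. The sketch below makes two simplifying assumptions that are not part of Hearst's method: a noun phrase is reduced to a single alphabetic word, and the plural hypernym is singularized by crudely stripping a final "s". The function and variable names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import List, Tuple

# Assumption: a noun phrase (NP) is approximated by one alphabetic word.
NP = r"[A-Za-z]+"

# such NP as {NP,}* {(or | and)} NP
SUCH_AS = re.compile(
    rf"such (?P<hypernym>{NP}) as (?P<items>(?:{NP}, )*(?:(?:or|and) )?{NP})"
)


def extract_hyponyms(sentence: str) -> List[Tuple[str, str]]:
    """Return (hypernym, hyponym) pairs found by the 'such NP as ...' pattern."""
    match = SUCH_AS.search(sentence)
    if match is None:
        return []
    hypernym = match.group("hypernym")
    if hypernym.endswith("s"):          # crude singularization, illustration only
        hypernym = hypernym[:-1]
    items = [word for word in re.findall(NP, match.group("items"))
             if word not in ("or", "and")]
    return [(hypernym, item) for item in items]


sentence = "works by such authors as Herrick, Goldsmith, and Shakespeare"
for hypernym, hyponym in extract_hyponyms(sentence):
    print(f'hyponym("{hypernym}", "{hyponym}")')
```
]]></preformat>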
        <p>These patterns are often easy to comprehend by regular users, yet defining the
right patterns to mine corpora to obtain unknown information is not a trivial task.
Hearst stresses that, in order to return desired results successfully, patterns should
be defined in such a way that they occur frequently and in many text genres. Also,
they should often indicate the relation of interest and should be recognizable with
little or no pre-encoded knowledge. Furthermore, all existing syntactic variations
have to be included into a complex pattern to ensure its proper working.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Lexico-semantic patterns</title>
        <p>
          Lexico-semantic patterns are employed to remedy a limitation of lexico-syntactic patterns: the absence of
concepts that carry a specific meaning.
In addition to the combination of lexical representations and syntactical
information used by lexico-syntactic patterns, lexico-semantic patterns also permit the
use of semantic information such as concepts that are defined in ontologies. Thus,
lexico-semantic patterns combine lexical representations with syntactic and
semantic information. Lexico-semantic patterns were first presented in [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], a work
of 1991 describing a text processing system based on lexico-semantic
patterns. These patterns can include terms and operators such as lexical features,
logical combinations, and repetition, which are mostly adopted from regular
expression languages.
        </p>
        <p>
          The following example, given by Wooter et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], is a lexico-semantic pattern that
classifies the verb phrase "left dead" as expressing death or injury:
?PIVOT = (or found left shot)
?OBJ = ?EFFECT=dead
=&gt; (mark-activator murder d-vp) ;
        <p>
          This pattern would also match "found dead" and "shot dead". In addition to standard
elements such as repetition and wildcards, the rule presented here contains features
such as variable assignment on the left-hand side (LHS) (where words preceded by ?
denote variables) and, on the right-hand side (RHS), macros such as mark-activator,
which uses the results of the pattern match, including variable assignments, along
with some other constants, such as murder and d-vp, to tag and segment the
text. Lexico-semantic patterns offer many advantages; the most
important is that they take into account the domain semantics, which helps the parser
cope with the complexity and flexibility of unstructured text written in natural
language [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
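        <p>The behavior of such a rule can be imitated, very roughly, with a regular expression. The sketch below is not the original rule language and does not reproduce its semantics: it assumes that the pivot is one of "found", "left" or "shot", that the optional object is at most three words long, and that a match is tagged with the constants murder and d-vp. All function and key names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import Dict, List

PIVOT = ("found", "left", "shot")        # ?PIVOT = (or found left shot)
PIVOT_ALTERNATIVES = "|".join(PIVOT)

# pivot verb, an optional object of up to three words (assumption), then "dead"
RULE = re.compile(
    r"\b(?P<pivot>" + PIVOT_ALTERNATIVES + r")\b"
    r"(?P<obj>(?:\s+\w+){0,3}?)"
    r"\s+(?P<effect>dead)\b",
    re.IGNORECASE,
)


def mark_activators(text: str) -> List[Dict[str, str]]:
    """Tag every match of the rule with the constants 'murder' and 'd-vp'."""
    return [
        {
            "concept": "murder",
            "category": "d-vp",
            "pivot": match.group("pivot").lower(),
            "object": match.group("obj").strip(),
            "span": match.group(0),
        }
        for match in RULE.finditer(text)
    ]


text = "The blast left three people dead; a guard was found dead nearby."
result = mark_activators(text)
for annotation in result:
    print(annotation)
```
]]></preformat>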
        <p>
          In the current body of literature, there are many works based on knowledge-driven
approaches for event extraction. For instance, in their work of 2012, Wooter et
al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] proposed a rule-based method to learn ontology instances from text, in which
they defined a lexico-semantic pattern language that, in addition to the lexical and
syntactical information present in lexico-syntactic rules, also makes use of semantic
information.
        </p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], the authors proposed the use of lexico-semantic patterns for extracting financial
events from RSS news feeds in order to allow investors on financial markets to
monitor financial events when deciding on buying and selling equities. These patterns
use financial ontologies, lifting the commonly used lexico-syntactic patterns to a
higher abstraction level, thus enabling lexico-semantic patterns to recognize
events in text more precisely than lexico-syntactic patterns. To that end, the authors
developed rules based on lexico-semantic patterns used to find events, and
semantic actions that update the domain ontology with the effects of the
discovered events. Pattern creation was based on the triple paradigm (i.e.,
it makes use of a subject, a predicate, and an optional object) and relies on
triple conversion to the Java Annotations Pattern Engine (JAPE)<sup>1</sup> language [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]
and SPARQL<sup>2</sup> [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Another work on economic event extraction was presented
by the same authors [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], in which they proposed a semantics-based information
extraction pipeline for economic event detection, which makes use of lexico-semantic
patterns defined in the JAPE language. Other works in the same field
can be found in [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ], [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ].
        </p>
        <p>Knowledge-driven approaches alleviate many problems
encountered with data-driven approaches. The first issue they address
is that there is no need for a huge amount of training
data (the text corpora demanded by data-driven approaches) to develop models that
approximate linguistic phenomena. The second important advantage is that
knowledge-driven approaches make it possible to rely on a
combination of lexical, syntactical and semantic elements to define powerful patterns, which
can be used to extract and recognize very specific information. Nevertheless, one
common drawback of knowledge-driven approaches is that
defining patterns that retrieve the correct, desired information requires
lexical knowledge and possibly also prior domain knowledge, so the help of expert linguists is needed. Also,
relying only on knowledge-driven approaches may cause difficulties and return weak
results, especially when a large number of events must be recognized.</p>
      </sec>
      <sec id="sec-2-5">
        <title>Hybrid approaches for event extraction</title>
        <p>
          Staying within the limits of one type of event extraction approach may not give
the best results. Combining data-driven approaches with knowledge-driven ones
can alleviate the drawbacks of each kind, and this creates a new kind
of approach: the hybrid approach. In practice, it is difficult to rely on only one
kind of event extraction approach. Therefore, the majority of works in the
recent literature rely on hybrid approaches. In
hybrid approaches, data-driven methods are generally used for the statistical
processing (bootstrapping, POS tagging, initial clustering, etc.) while
knowledge-driven methods are used for defining powerful expressions, generally by means
of lexical, syntactical and semantic elements [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]. In other words, data-driven
approaches are used to deal with huge amounts of data, while knowledge-driven approaches are used to
capture specific meaning.
        </p>
        <p>
          Kenji et al. [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] presented an approach to combine rule-based and data-driven NLP
techniques in the extraction of grammatical relations. They have shown that,
starting with a rule-based system, unlabeled data and a corpus-based system can be used
to improve recall (and F-score) of grammatical relations. In their work of 2004,
Camiano et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] proposed a hybrid approach to resolve issues caused by the
lack of expert knowledge, resorting to statistical methods to remedy these
issues. Pakhomov et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] combined statistical methods with lexical knowledge. A
similar orientation can be found in [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], where the authors used a hybrid approach
to reinforce statistical methods. The authors of [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] bootstrap a weakly supervised
pattern learning algorithm with clusters, in order to extract violent incidents from
online news with high precision and recall and store them in knowledge bases.
The authors of [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] employ a grammar-based statistical method for text mining, i.e.,
POS tagging. However, tagging is based on domain knowledge that is stored in
ontologies, thus making the event extraction a hybrid process. Finally, Chun et al.
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] extract events from biomedical literature by means of lexico-syntactic patterns,
combined with term co-occurrences.
        </p>
        <p><sup>1</sup> https://gate.ac.uk/sale/tao/splitch8.html</p>
        <p><sup>2</sup> http://www.w3.org/TR/rdf-sparql-query/</p>
      </sec>
      <sec id="sec-2-6">
        <p>The combination of data-driven approaches with knowledge-driven ones brings
several enhancements. For instance, although a large amount of data is still needed to
develop statistical models, the amount of data required by hybrid approaches is less
than in the case of purely data-driven approaches. Likewise, the number
of patterns that experts must develop for detecting events is smaller than in purely
knowledge-driven approaches, because statistical methods are used to discover
events automatically. Drawbacks are generally caused by the complexity of hybrid
systems, which encompass many techniques and methods from both data-driven and
knowledge-driven approaches.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Discussion</title>
      <p>In this section, we summarize the approaches and methods discussed above in
a table (Table 1), in which we expose the main differences between the
approaches. To do so, we list the techniques used for each approach
(data-driven or knowledge-driven), then the methods used for each approach
(hierarchical, graphs, SVM, ...) and the different types of events. We also present
the amount of data needed for each approach and, finally, the required
domain knowledge and expertise. As shown in Table 1, in terms of data usage,
knowledge-driven approaches require the least data. Experiments
show that only a couple of hundred documents or sentences are needed to generate
valuable and accurate event extraction rules. On the other hand, data-driven
approaches require more than ten thousand documents to build useful statistical
models that give acceptable results. For hybrid approaches, which combine
data-driven and knowledge-driven methods, the amount of required data remains high,
but it is lower than for data-driven approaches, which rely solely
on statistical techniques to extract rules. As for interpretability, knowledge-driven
approaches give the best results, especially lexico-semantic patterns,
which achieve the highest level of interpretability. The data-driven approaches give
the lowest accuracy. Based on the results of this survey, and in order to choose
the appropriate techniques and methods for event extraction, we recommend
resorting to knowledge-driven approaches for specific domains, owing to the ease,
simplicity and high accuracy of rule-based approaches. They also need less
data to generate useful models. On the other hand, we recommend data-driven
and hybrid approaches for users who deal with a large amount and variety of data and want to
extract various types of events.</p>
      <p>Table 1. Comparison of the surveyed event extraction approaches: techniques, methods, event types, amount of required data, and required domain knowledge and expertise.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>In this survey, we presented the main approaches in the current literature for event
extraction from text. As shown, data-driven approaches (corpus-based approaches)
require a large amount of data to discover statistical relations through
quantitative methods such as probabilistic modeling, information theory, and
linear algebra, which are used to develop models that approximate linguistic phenomena. These
approaches therefore require little domain knowledge and expertise. The main advantage
of corpus-based methods is that no expert knowledge is needed, but the resulting models have low
interpretability. Knowledge-driven approaches rely
mainly on patterns developed by experts, although a small amount of data is also needed
to develop these patterns. Pattern-based approaches give better results with high
interpretability, but cannot cope with large amounts of data when
various types of events must be extracted. Hybrid approaches, which combine
knowledge-driven and data-driven approaches, appear to be a good way to
remedy the drawbacks of each family of approaches and to retain the advantages of both
pattern-based and corpus-based methods.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>ACE (Automatic Content Extraction) English Annotation Guidelines for Events</surname>
          </string-name>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>[2] SPARQL query language for RDF, W3C recommendation</article-title>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K. C. H. V.</given-names>
            <surname>Aaltonen</surname>
          </string-name>
          , S and
          <string-name>
            <given-names>A.</given-names>
            <surname>Heinze</surname>
          </string-name>
          .
          <article-title>Social media in europe: Lessons from an online survey</article-title>
          .
          <source>Worcester College</source>
          , Oxford, UK,
          <year>2013</year>
          .
          <source>18th UKAIS Annual Conference: Social Information Systems.</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Adedoyin-Olowe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Gaber</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. T.</given-names>
            <surname>Stahl</surname>
          </string-name>
          .
          <article-title>A survey of data mining techniques for social media analysis</article-title>
          .
          <source>CoRR, abs/1312.4617</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anantharam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barnaghi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thirunarayan</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          .
          <article-title>Extracting city traffic events from social streams</article-title>
          .
          <source>ACM Trans. Intell. Syst. Technol.</source>
          ,
          <volume>6</volume>
          (
          <issue>4</issue>
          ):
          <fpage>43:1</fpage>-<lpage>43:27</lpage>
          ,
          <year>July 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Baldwin</surname>
          </string-name>
          .
          <article-title>Social media: Friend or foe of natural language processing</article-title>
          ?
          <source>In Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation</source>
          , pages
          <fpage>58</fpage>-<lpage>59</lpage>
          , Bali, Indonesia,
          <year>November 2012</year>
          . Faculty of Computer Science, Universitas Indonesia.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Borsje</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Frasincar</surname>
          </string-name>
          .
          <article-title>Semi-automatic financial events discovery based on lexico-semantic patterns</article-title>
          .
          <source>Int. J. Web Eng. Technol.</source>
          ,
          <volume>6</volume>
          (
          <issue>2</issue>
          ):
          <fpage>115</fpage>-<lpage>140</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Kalashnikov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          .
          <article-title>Exploiting context analysis for combining multiple entity resolution systems</article-title>
          .
          <source>In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data</source>
          , pages
          <fpage>207</fpage>-<lpage>218</lpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <article-title>Learning by googling</article-title>
          .
          <source>SIGKDD Explor. Newsl.</source>
          ,
          <volume>6</volume>
          (
          <issue>2</issue>
          ):
          <fpage>24</fpage>-<lpage>33</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maynard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tablan</surname>
          </string-name>
          .
          <article-title>JAPE: a Java Annotation Patterns Engine (Second Edition)</article-title>
          .
          <source>Research Memorandum CS0010</source>
          , Department of Computer Science, University of Sheffield,
          <year>November 2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>De Pauw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.-M.</given-names>
            <surname>de Schryver</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Wagacha</surname>
          </string-name>
          .
          <article-title>Data-driven part-of-speech tagging of kiswahili</article-title>
          .
          <source>In Text, Speech and Dialogue</source>
          , volume
          <volume>4188</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>197</fpage>-<lpage>204</lpage>
          . Springer Berlin Heidelberg,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dredze</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wallenberg</surname>
          </string-name>
          .
          <article-title>Icelandic data driven part of speech tagging</article-title>
          .
          <source>In ACL 2008, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, June 15-20, 2008, Columbus, Ohio, USA, Short Papers</source>
          , pages
          <fpage>33</fpage>-<lpage>36</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Farah</surname>
          </string-name>
          .
          <article-title>Extraction de concepts et de relations entre concepts à partir des documents multilingues : Approche statistique et ontologique</article-title>
          .
          <source>PhD thesis</source>
          , Institut National des Sciences Appliquées de Lyon, Lyon, France,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.-J. . L. L.</given-names>
            <surname>Frasincar</surname>
          </string-name>
          ,
          <string-name>
            <surname>F.</surname>
          </string-name>
          <article-title>A semantic web-based approach for building personalized news services</article-title>
          .
          <source>International Journal of E-Business Research (IJEBR)</source>
          ,
          <volume>5</volume>
          :
          <fpage>19</fpage>
          ,
          <year>2009</year>
          . 3.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Grishman</surname>
          </string-name>
          .
          <source>Information extraction: Capabilities and challenges. Lecture Notes</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hearst</surname>
          </string-name>
          .
          <source>Automated discovery of wordnet relations</source>
          .
          <source>WordNet: an electronic lexical database</source>
          , pages
          <fpage>131</fpage>-<lpage>153</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Frasincar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kaymak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>de Jong</surname>
          </string-name>
          .
          <article-title>An overview of event extraction from text</article-title>
          . In
          <source>Workshop on Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE 2011) at Tenth International Semantic Web Conference (ISWC 2011)</source>
          , volume
          <volume>779</volume>
          of CEUR Workshop Proceedings. CEUR-WS.org,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Frasincar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kaymak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>van der Meer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Schouten</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Vandic</surname>
          </string-name>
          .
          <article-title>Speed: A semantics-based pipeline for economic event detection</article-title>
          . In J. Parsons,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saeki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shoval</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Woo</surname>
          </string-name>
          , and Y. Wand, editors,
          <source>Conceptual Modeling – ER 2010</source>
          , volume
          <volume>6412</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>452</fpage>-<lpage>457</lpage>
          . Springer Berlin Heidelberg,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>W.</given-names>
            <surname>IJntema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sangers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Frasincar</surname>
          </string-name>
          .
          <article-title>A lexico-semantic pattern language for learning ontology instances from text</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          ,
          <volume>15</volume>
          (
          <issue>3</issue>
          ),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>W.</given-names>
            <surname>IJntema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sangers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Frasincar</surname>
          </string-name>
          .
          <article-title>A lexico-semantic pattern language for learning ontology instances from text</article-title>
          .
          <source>J. Web Sem.</source>
          ,
          <volume>15</volume>
          :
          <fpage>37</fpage>-<lpage>50</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Jacobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. R.</given-names>
            <surname>Krupka</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Rau</surname>
          </string-name>
          .
          <article-title>Lexico-semantic pattern matching as a companion to parsing in text understanding</article-title>
          .
          <source>In Proceedings of the Workshop on Speech and Natural Language</source>
          , pages
          <fpage>337</fpage>-<lpage>341</lpage>
          , Stroudsburg, PA, USA,
          <year>1991</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C.</given-names>
            <surname>Klaussner</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhekova</surname>
          </string-name>
          .
          <article-title>Lexico-syntactic patterns for automatic ontology building</article-title>
          .
          <source>In Proceedings of the Second Student Research Workshop associated with RANLP</source>
          <year>2011</year>
          , pages
          <fpage>109</fpage>-<lpage>114</lpage>
          , Hissar, Bulgaria,
          <year>September 2011</year>
          .
          <article-title>RANLP 2011 Organising Committee</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>C.-S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-J.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.-W.</given-names>
            <surname>Jian</surname>
          </string-name>
          .
          <article-title>Ontology-based fuzzy event extraction agent for chinese e-news summarization</article-title>
          .
          <source>Expert Syst. Appl.</source>
          ,
          <volume>25</volume>
          (
          <issue>3</issue>
          ):
          <fpage>431</fpage>-<lpage>447</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-D.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.-C.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>A system for detecting and tracking internet news event</article-title>
          . In Y.-S. Ho and H. J. Kim, editors,
          <source>PCM (1)</source>
          , volume
          <volume>3767</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>754</fpage>-<lpage>764</lpage>
          . Springer,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>B.</given-names>
            <surname>Megyesi</surname>
          </string-name>
          .
          <article-title>Data-Driven syntactic analysis methods and applications for Swedish</article-title>
          .
          <source>PhD thesis</source>
          , Doctoral dissertation, Department of Speech, Music and Hearing, KTH, Kungliga Tekniska Högskolan,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>C. S. Nicole</surname>
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Ellison</surname>
            and
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Lampe</surname>
          </string-name>
          .
          <article-title>The benefits of facebook friends: Social capital and college students' use of online social network sites</article-title>
          .
          <source>Computer Mediated Communication</source>
          ,
          <volume>12</volume>
          ,
          <year>July 2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>M.</given-names>
            <surname>Okamoto</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Kikuchi</surname>
          </string-name>
          .
          <article-title>Discovering volatile events in your neighborhood: Local-area topic extraction from blog entries</article-title>
          . In G. G. Lee,
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Aizawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kuriyama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yoshioka</surname>
          </string-name>
          , and T. Sakai, editors,
          <source>AIRS</source>
          , volume
          <volume>5839</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>181</fpage>-<lpage>192</lpage>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Pakhomov</surname>
          </string-name>
          .
          <article-title>Semi-supervised maximum entropy based approach to acronym and abbreviation normalization in medical texts</article-title>
          .
          <source>In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics</source>
          , pages
          <fpage>160</fpage>-<lpage>167</lpage>
          , Stroudsburg, PA, USA,
          <year>2002</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Piskorski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tanev</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. O.</given-names>
            <surname>Wennerberg</surname>
          </string-name>
          .
          <article-title>Extracting violent events from on-line news for ontology population</article-title>
          . In W. Abramowicz, editor,
          <source>BIS</source>
          , volume
          <volume>4439</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>287</fpage>-<lpage>300</lpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>V.</given-names>
            <surname>Punyakanok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          , and W.-t. Yih.
          <article-title>The importance of syntactic parsing and inference in semantic role labeling</article-title>
          .
          <source>Comput. Linguist.</source>
          ,
          <volume>34</volume>
          (
          <issue>2</issue>
          ):
          <fpage>257</fpage>-<lpage>287</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>D.</given-names>
            <surname>Rusu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hodson</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Kimball</surname>
          </string-name>
          .
          <article-title>Unsupervised techniques for extracting and clustering complex events in news</article-title>
          .
          <source>In Proceedings of the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation</source>
          , pages
          <fpage>26</fpage>-<lpage>34</lpage>
          , Baltimore, Maryland, USA,
          <year>June 2014</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sagae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lavie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>MacWhinney</surname>
          </string-name>
          .
          <article-title>Combining rule-based and data-driven techniques for grammatical relation extraction in spoken language</article-title>
          . In Proceedings of the Eighth International Workshop in Parsing, pages
          <fpage>153</fpage>-<lpage>162</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saval</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bouzid</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Brunessaux</surname>
          </string-name>
          .
          <article-title>A semantic extension for event modelisation</article-title>
          .
          <source>In Tools with Artificial Intelligence, 2009. ICTAI '09. 21st International Conference on</source>
          , pages
          <fpage>139</fpage>–<lpage>146</lpage>
          ,
          <year>Nov 2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>V.</given-names>
            <surname>Soulignac</surname>
          </string-name>
          .
          <article-title>Système informatique de capitalisation de connaissances et d'innovation pour la conception et le pilotage de systèmes de culture durables</article-title>
          . Theses, Université Blaise Pascal - Clermont-Ferrand II, Oct.
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Erdmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Maedche</surname>
          </string-name>
          .
          <article-title>Engineering Ontologies using Semantic Patterns</article-title>
          . Seattle,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Erdmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Maedche</surname>
          </string-name>
          .
          <article-title>Engineering ontologies using semantic patterns</article-title>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>H.</given-names>
            <surname>Tanev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Piskorski</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Atkinson</surname>
          </string-name>
          .
          <article-title>Real-time news event extraction for global crisis monitoring</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems</source>
          , pages
          <fpage>207</fpage>–<lpage>218</lpage>
          , Berlin, Heidelberg,
          <year>2008</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>C.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Strassel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Medero</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Maeda</surname>
          </string-name>
          .
          <article-title>ACE 2005 Multilingual Training Corpus</article-title>
          .
          <source>Linguistic Data Consortium</source>
          , Philadelphia,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Waterman</surname>
          </string-name>
          .
          <article-title>Structural methods for lexical/semantic patterns</article-title>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Trigg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hall</surname>
          </string-name>
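          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Holmes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          .
          <article-title>Weka: Practical machine learning tools and techniques with Java implementations</article-title>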
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          .
          <article-title>A simple Bayesian modelling approach to event extraction from Twitter</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          , pages
          <fpage>700</fpage>–<lpage>705</lpage>
          , Baltimore, Maryland,
          <year>June 2014</year>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>