<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A Survey of Textual Event Extraction from Social Networks</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mohamed</forename><surname>Mejri</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Institut Supérieur de Gestion de Tunis</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jalel</forename><surname>Akaichi</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Institut Supérieur de Gestion de Tunis</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">A Survey of Textual Event Extraction from Social Networks</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">B563A733CD60C77FFDC97EAF2707401C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T13:22+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Event Extraction</term>
					<term>Text Mining</term>
					<term>Information Extraction</term>
					<term>Social Network</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In the last decade, mining textual content on social networks to extract relevant data and useful knowledge has become a ubiquitous task. One common application of text mining is Event Extraction, a complex task divided into multiple sub-tasks of varying difficulty. In this paper, we present a survey of the main existing text mining techniques used for different event extraction purposes. First, we present the main data-driven approaches, which rely on statistical models to convert data to knowledge. Second, we present the knowledge-driven approaches, which rely on expert knowledge to extract knowledge, usually by means of pattern-based approaches. Then we present the main existing hybrid approaches, which combine data-driven and knowledge-driven approaches. We end this paper with a comparative study that recapitulates the main features of each presented method.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Social networks are defined as web-based systems (dedicated websites or other applications) that allow users to create public or semi-public profiles and to communicate with each other over the internet by posting information, comments, messages, videos, etc. <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b7">8]</ref>.</p><p>In recent years, social networks have become omnipresent because of the growing spread and affordability of internet-enabled devices such as personal computers, smartphones, tablets and many other devices that allow users to connect to social networks through internet services <ref type="bibr" target="#b2">[3]</ref>. These new possibilities allow people to add, update, share and consult massive quantities of new information in real time, from anywhere and at any time. The information added by hundreds of millions of active users <ref type="bibr" target="#b25">[26]</ref> is considered a very important source of data for many research fields. These massive quantities of data raise three computational issues: size, noise and dynamism <ref type="bibr" target="#b3">[4]</ref>. These issues make manual analysis of social network data practically impossible. To remedy this problem, data mining provides a wide range of techniques for detecting useful knowledge in massive datasets. Most social network data is initially unstructured and usually expressed in natural language, which makes the understanding and interpretation of social network content by machines a difficult task <ref type="bibr" target="#b5">[6]</ref>. 
This problem impedes the automation of Text Mining (TM) sub-tasks such as Information Retrieval <ref type="bibr" target="#b12">[13]</ref> and Information Extraction (IE) <ref type="bibr" target="#b14">[15]</ref>, which are frequently used in decision making.</p><p>In general, we can define Text Mining (TM) as the analysis of data contained in natural language text. TM works by transposing words and phrases in unstructured data into numerical values, which can then be linked with structured data in a database and analyzed with traditional data mining techniques <ref type="bibr" target="#b39">[40]</ref>. By means of text mining, often using Natural Language Processing (NLP) techniques, information is extracted from texts of various sources, such as news messages and blogs, and is represented and stored in a structured way (generally in databases). A specific type of knowledge that can be extracted from text by means of TM is an event, which can be represented as a complex combination of relations linked to a set of empirical observations from texts <ref type="bibr" target="#b16">[17]</ref>. Event Extraction (EE) from the textual content of social networks has gained remarkable attention in the last few years. For example, the representation &lt;person&gt; &lt;attack&gt; &lt;person&gt; describes an attack event.</p><p>Words identified in text as referring to persons are linked to the concept &lt;person&gt;; verbs having the meaning of attack are associated with &lt;attack&gt;. Thus, the same event representation can be detected in texts such as "John shot his friend" or "A woman was attacked by a stranger". Saval et al. <ref type="bibr" target="#b32">[33]</ref> proposed a semantic extension for the modeling of events of type "natural disasters". They define an event E as the combination of three components: a semantic property S, a time interval I, and a spatial entity SP. Thus, an event is represented as E = &lt;I, SP, S&gt;. In their work of 2014, Serrano et al. 
<ref type="bibr" target="#b33">[34]</ref> adapted this event representation by enriching it with an additional component A corresponding to the different participants involved in the event. The representation thus becomes &lt;I, SP, S, A&gt;, where A is a set of participants, each playing one or more roles. A participant is denoted P_i, with 0 &lt; i &lt; n, and a role is denoted r_j, with 0 &lt; j &lt; k. Component A is then defined as A = {(P_α, r_β)}, meaning that participant P_α plays role r_β in the event concerned.</p><p>Event extraction from unstructured textual content can be useful for IE systems in various ways. In fact, being able to detect and retrieve events can enhance the quality and performance of personalized systems <ref type="bibr" target="#b13">[14]</ref>. The use of events extracted from the textual content of social networks to deal with several issues is therefore becoming unavoidable. However, extracting events is a very difficult task, divided into many sub-tasks of different complexity, and it requires the combination of many techniques and methods depending on the task treated.</p><p>In this paper, we present a survey of the main existing approaches to EE in the literature. In the first section, we present the data-driven event extraction approaches, which rely on statistical methods to convert data to knowledge; then we present the main knowledge-driven approaches, which extract knowledge through the representation and exploitation of expert knowledge, usually by means of pattern-based approaches. The last part of the first section is devoted to hybrid methods, which combine data-driven and knowledge-driven approaches. In section 3, we present a quick overview of the main multilingual event extraction systems used in the recent literature. In the third section, we discuss the main existing works that combine event extraction and risk management. 
Finally, we end this paper with a comparative study in which we present the main differences, advantages and disadvantages of each approach.</p></div>
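<div xmlns="http://www.tei-c.org/ns/1.0"><p>The event representation &lt;I, SP, S, A&gt; described in the introduction can be sketched as a small data structure. The following Python fragment is only an illustration under our own assumptions: the class names, field names and the toy flood instance are hypothetical and do not come from the cited works.</p>
```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Participation:
    """One element (P, r) of component A: participant P plays role r."""
    participant: str
    role: str


@dataclass
class Event:
    """Hypothetical container for the tuple (I, SP, S, A)."""
    interval: tuple          # I: (start, end) time interval
    spatial_entity: str      # SP: where the event takes place
    semantic_property: str   # S: what kind of event it is
    participants: list = field(default_factory=list)  # A: (P, r) pairs

    def roles_of(self, participant):
        """Return every role played by the given participant."""
        return [p.role for p in self.participants
                if p.participant == participant]


# Toy instance; every value below is invented for illustration.
flood = Event(
    interval=(date(2014, 5, 1), date(2014, 5, 3)),
    spatial_entity="Tunis",
    semantic_property="flood",
    participants=[
        Participation("Civil Protection", "rescuer"),
        Participation("Residents", "victim"),
    ],
)
print(flood.roles_of("Residents"))
```
<p>Storing A as a list of (participant, role) pairs keeps the possibility, stated above, that one participant plays several roles in the same event.</p></div>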
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Event extraction from textual content</head><p>In the available annotated corpora geared toward information extraction, we see two models of events, emphasizing different aspects. On the one hand, there is the TimeML model, in which an event is a word that points to a node in a network of temporal relations. On the other hand, there is the ACE model, in which an event is a complex structure, relating arguments that are themselves complex structures, but with only ancillary temporal information (in the form of temporal arguments, which are only noted when explicitly given). In the TimeML model, every event is annotated, because every event takes part in the temporal network.</p><p>In the ACE model, only interesting events (events that fall into one of 34 predefined categories) are annotated. The task of automatically extracting ACE events is more complex than extracting TimeML events (in line with the increased complexity of ACE events), involving detection of event anchors, assignment of an array of attributes, identification of arguments and assignment of roles, and determination of event coreference.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Events in the ACE program</head><p>The ACE program provides annotated data, evaluation tools, and periodic evaluation exercises for a variety of information extraction tasks. There are five basic kinds of extraction targets supported by ACE: entities, times, values, relations, and events. The ACE tasks for 2005 are more fully described in <ref type="bibr" target="#b0">[1]</ref>.</p><p>ACE entities fall into seven types (person, organization, location, geo-political entity, facility, vehicle, weapon), each with a number of subtypes. Within the ACE program, a distinction is made between entities and entity mentions (similarly between events and event mentions, and so on). An entity mention is a referring expression in text (a name, pronoun, or other noun phrase) that refers to something of an appropriate type. An entity, then, is either the actual referent, in the world, of an entity mention or the cluster of entity mentions in a text that refer to the same actual entity. The ACE Entity Detection and Recognition task requires both the identification of expressions in text that refer to entities (i.e., entity mentions) and coreference resolution to determine which entity mentions refer to the same entities. ACE events, like ACE entities, are restricted to a range of types. Thus, not all events in a text are annotated, only those of an appropriate type. The eight event types (with subtypes in parentheses) are Life (Be-Born, Marry, Divorce, Injure, Die), Movement (Transport), Transaction (Transfer-Ownership, Transfer-Money), Business (Start-Org, Merge-Org, Declare-Bankruptcy, End-Org), Conflict (Attack, Demonstrate), Contact (Meet, Phone-Write), Personnel (Start-Position, End-Position, Nominate, Elect), and Justice (Arrest-Jail, Release-Parole, Trial-Hearing, Charge-Indict, Sue, Convict, Sentence, Fine, Execute, Extradite, Acquit, Appeal, Pardon). Since there is nothing inherent in the task that requires the two levels of type and subtype, for the remainder of the paper, we will refer to the combination of event type and subtype (e.g., Life:Die) as the event type. 
In addition to their type, events have four other attributes (possible values in parentheses): modality (Asserted, Other), polarity (Positive, Negative), genericity (Specific, Generic), tense (Past, Present, Future, Unspecified).</p><p>The most distinctive characteristic of events (unlike entities, times, and values, but like relations) is that they have arguments. Each event type has a set of possible argument roles, which may be filled by entities, values, or times. In all, there are 35 role types, although no single event can have all 35 roles. A complete description of which roles go with which event types can be found in the annotation guidelines for ACE events <ref type="bibr" target="#b37">[38]</ref>. Events, like entities, are distinguished from their mentions in text. An event mention is a span of text (an extent, usually a sentence) with a distinguished anchor ("the word that most clearly expresses [an event's] occurrence" <ref type="bibr" target="#b37">[38]</ref>) and zero or more arguments, which are entity mentions, timexes, or values in the extent. An event is either an actual event, in the world, or a cluster of event mentions that refer to the same actual event. Note that the arguments of an event are the entities, times, and values corresponding to the entity mentions, timexes, and values that are arguments of the event mentions that make up the event. The official evaluation metric of the ACE program is ACE value, a cost-based metric which associates a normalized, weighted cost to system errors and subtracts that cost from a maximum score of 100%. For events, the associated costs are largely determined by the costs of the arguments, so that errors in entity, timex, and value recognition are multiplied in event ACE value. 
Since it is useful to evaluate the performance of event detection and recognition independently of the recognition of entities, times, and values, the ACE program includes diagnostic tasks, in which partial ground truth information is provided. Of particular interest here is the diagnostic task for event detection and recognition, in which ground truth entities, values, and times are provided.</p><p>According to ACE terminology, an event trigger is the word that determines the event occurrence; an argument is an entity mention, a value or a temporal expression that constitutes an event attribute; and an event mention is an extent of text with the distinguished trigger, entity mentions and other argument types <ref type="bibr" target="#b14">[15]</ref>.</p><p>As mentioned above, event extraction is a complex task divided into many subtasks; therefore, many techniques for event extraction from textual content exist in the literature. As will be shown in this paper, the choice of suitable techniques depends on the final requirements of each extraction task. In this section, we present a survey of the main methods and approaches used in recent literature: the data-driven approaches, the knowledge-driven approaches and the hybrid approaches. We end this section with a comparative study that recapitulates the main features, fields of application, advantages and disadvantages of each approach.</p></div>
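<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ACE notions described above (trigger, arguments with roles, event attributes, and events as clusters of event mentions) can be pictured with a minimal Python sketch. The class and function names are our own assumptions, the example sentences are invented, and the coreference test is a deliberately naive stand-in; none of this reproduces the ACE annotation tools.</p>
```python
from dataclasses import dataclass, field


@dataclass
class Argument:
    role: str   # illustrative role label, e.g. "Attacker"
    text: str   # the entity mention, timex, or value found in the extent
    kind: str   # "entity", "time", or "value"


@dataclass
class EventMention:
    extent: str       # span of text, usually a sentence
    trigger: str      # anchor word expressing the event occurrence
    event_type: str   # type and subtype combined, e.g. "Conflict:Attack"
    arguments: list = field(default_factory=list)
    modality: str = "Asserted"
    polarity: str = "Positive"
    genericity: str = "Specific"
    tense: str = "Unspecified"


def cluster_mentions(mentions, same_event):
    """Group mentions into events using a caller-supplied coreference test."""
    events = []
    for mention in mentions:
        for cluster in events:
            if same_event(cluster[0], mention):
                cluster.append(mention)
                break
        else:
            # No existing cluster accepted the mention: start a new event.
            events.append([mention])
    return events


m1 = EventMention(
    extent="A woman was attacked by a stranger.",
    trigger="attacked",
    event_type="Conflict:Attack",
    arguments=[Argument("Attacker", "a stranger", "entity"),
               Argument("Target", "A woman", "entity")],
    tense="Past",
)
m2 = EventMention(extent="The attack shocked the town.",
                  trigger="attack", event_type="Conflict:Attack")
m3 = EventMention(extent="He died in 1990.", trigger="died",
                  event_type="Life:Die", tense="Past")

# Naive assumption for the demo: same event type means same event.
events = cluster_mentions([m1, m2, m3],
                          lambda a, b: a.event_type == b.event_type)
print([len(cluster) for cluster in events])
```
<p>Passing the coreference test as a parameter keeps the sketch neutral about how event coreference is actually decided, which is one of the harder sub-tasks mentioned above.</p></div>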
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Data-driven approaches for event extraction</head><p>In contrast to pattern-based approaches (which are presented in section 2.2), data-driven approaches automatically build models for particular NLP (i.e., automated language processing) tasks with no human intervention. In other words, these approaches try to discover statistical relations using only quantitative methods such as probabilistic modeling, information theory, and linear algebra. To develop models that approximate linguistic phenomena, data-driven methods require large text corpora, which is why these techniques are often called corpus-based. Examples of discovered facts are words or concepts that are (statistically) associated with one another. In recent literature, many techniques associated with data-driven approaches are used, such as word frequency counting, Term Frequency - Inverse Document Frequency (TF-IDF), word sense disambiguation (WSD), n-grams, and clustering.</p><p>One common task in data-driven approaches for event extraction from text is Part-of-Speech (POS) tagging, which is the process of assigning a part-of-speech to each word in a sentence. In their work of 2006, Guy et al. <ref type="bibr" target="#b10">[11]</ref> compared four data-driven taggers (TnT, MBT, SVMTool and MXPOST).</p><p>The experiments, in which these data-driven taggers were applied to a given dataset (the annotated Helsinki Corpus of Swahili), show that MXPOST is the most accurate tagger for this dataset. In another set of experiments, they further improved on the performance of the individual taggers by combining them into a committee of taggers. The results obtained showed that combining several taggers may enhance the performance and accuracy of the system. 
In the same field, and to deal with morphologically complex languages, Mark and Joel <ref type="bibr" target="#b11">[12]</ref> extended a statistical tagger to handle fine-grained tagsets and improved over the best Icelandic POS tagger. Additionally, they developed a case tagger for non-local case and gender decisions. Delia et al. <ref type="bibr" target="#b30">[31]</ref> investigated different unsupervised techniques for extracting and clustering complex events from news articles. As a first step, they proposed two complementary event extraction algorithms, based on identifying verbs and their arguments and on shortest paths between entities, respectively. Next, they obtained more general representations of the event mentions by annotating the event trigger and arguments with concepts from knowledge bases.</p><p>The generalized arguments were used as features for a clustering approach, thus determining related events.</p><p>In their work of 2014, Deyu et al. <ref type="bibr" target="#b40">[41]</ref> proposed a simple Bayesian modeling approach to event extraction from Twitter, called the Latent Event Model (LEM), to extract structured representations of events from social media. The proposed approach is fully unsupervised and does not require annotated data for training.</p><p>The model only requires the identification of named entities, locations and time expressions. After that, it can automatically extract events involving a named entity at a certain time and location, with event-related keywords, based on the co-occurrence patterns of the event elements. Okamoto et al. <ref type="bibr" target="#b26">[27]</ref> presented a method for the detection of occasional or volatile local events using topic extraction technologies. They proposed a framework based on a two-level hierarchical clustering method. The use of clustering techniques gave acceptable results with good accuracy for event extraction. Liu et al. 
<ref type="bibr" target="#b4">[5]</ref> presented a framework for simultaneous key entity extraction and significant event mining from daily web news, based on clustering, entity modeling and a weighted undirected bipartite graph. In the same field, the authors of <ref type="bibr" target="#b36">[37]</ref> developed a real-time news event extraction system based on automatic pattern learning from a small annotated corpus. To guarantee that massive amounts of textual data can be digested in real time, they developed ExPRESS (Extraction Pattern Engine and Specification Suite), a highly efficient extraction pattern engine capable of matching thousands of patterns within seconds. In <ref type="bibr" target="#b23">[24]</ref>, Lei et al. presented a framework for extracting and tracking topic-relevant events based on the SVM algorithm.</p><p>The main advantage of data-driven approaches for event extraction is that they need neither expert knowledge nor linguistic resources. However, data-driven approaches require large text corpora in order to develop models that approximate linguistic phenomena. Another drawback is that data-driven methods do not deal with the meaning of text. To remedy this problem, researchers resort to knowledge-driven approaches, which are based on patterns that express rules representing expert knowledge.</p></div>
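<div xmlns="http://www.tei-c.org/ns/1.0"><p>Among the data-driven techniques listed in this section, TF-IDF is simple enough to sketch in a few lines. The Python fragment below uses one common variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency); the function name and the three toy documents are our own assumptions and are not taken from any of the surveyed systems.</p>
```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # tf = relative frequency in the document, idf = log(N / df).
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus, invented for illustration.
docs = [
    "earthquake hits city".split(),
    "city council meets".split(),
    "earthquake damages city bridge".split(),
]
scores = tf_idf(docs)
print(max(scores[0], key=scores[0].get))
```
<p>A term that occurs in every document (here "city") receives weight zero, which illustrates why such statistics capture how distinctive a word is but say nothing about its meaning.</p></div>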
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Knowledge-driven approaches for event extraction</head><p>Also known as rule-based methods, knowledge-driven methods are commonly based on patterns constructed by linguists. Patterns consist of lexically specified syntactic templates that are matched to text, in much the same way as regular expressions, and are applied along with type constraints on substrings of the match. These patterns are lexically indexed local grammar fragments, annotated with semantic relations between the various arguments and the knowledge representation <ref type="bibr" target="#b38">[39]</ref>. These rules or patterns thus rely on linguistic knowledge about the structure of language and are written in a formal notation so that they can be used by the computer for further parsing <ref type="bibr" target="#b24">[25]</ref>. The design of patterns (which may be lexico-syntactic or lexico-semantic) and the choice of appropriate techniques generally depend on many factors, such as the language of the text to be processed and the final purpose of processing. Lexico-syntactic patterns combine lexical and syntactical information <ref type="bibr" target="#b21">[22]</ref>, while lexico-semantic patterns add semantic information, generally through the use of gazetteers <ref type="bibr" target="#b18">[19]</ref> or ontologies <ref type="bibr" target="#b19">[20]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lexico-syntactic patterns</head><p>As mentioned before, lexico-syntactic patterns are combinations of lexical representations (i.e., strings) and syntactical information (e.g., part-of-speech).</p><p>For further clarification, we present the following lexico-syntactic pattern given by Hearst in a work of 1998 <ref type="bibr" target="#b15">[16]</ref>: such NP as {NP,}* {(or | and)} NP. The aim is to find hyponym and hypernym relations by discovering regular expression patterns in free text. In this pattern, NP denotes a noun phrase.</p><p>The other text (i.e., such, as, or, and and) is used for lexical matching, while ( and ) enclose conjunction and disjunction statements to be evaluated, in this case a disjunction (denoted by |). Also, * is a repetition operator indicating that the sequence between the braces { and } may repeat zero or more times. Applying this lexico-syntactic pattern to the sentence ". . . works by such authors as Herrick, Goldsmith, and Shakespeare" gives the following results: hyponym("author", "Herrick"), hyponym("author", "Goldsmith"), hyponym("author", "Shakespeare"). These patterns are often easy for regular users to comprehend, yet defining the right patterns to mine corpora for unknown information is not a trivial task.</p><p>Hearst stresses that, in order to return the desired results successfully, patterns should be defined in such a way that they occur frequently and in many text genres. Also, they should often indicate the relation of interest and should be recognizable with little or no pre-encoded knowledge. Furthermore, all existing syntactic variations have to be included in a complex pattern to ensure that it works properly.</p></div>
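<div xmlns="http://www.tei-c.org/ns/1.0"><p>The "such NP as" pattern discussed in this subsection can be approximated with an ordinary regular expression. The Python sketch below is a simplification under our own assumptions: each noun phrase is crudely reduced to a single word, no part-of-speech information or lemmatization is used, and the function and constant names are hypothetical.</p>
```python
import re

# "such NP as {NP,}* {(or | and)} NP", with NP approximated by one word.
HEARST_SUCH_AS = re.compile(
    r"such (\w+) as ((?:\w+(?:, (?:and |or )?| and | or ))*\w+)"
)
# Separators between the listed items: ", ", ", and ", " or ", etc.
SEPARATOR = re.compile(r",? (?:and|or) |, ")


def extract_hyponyms(sentence):
    """Return (hypernym, hyponym) pairs found by the simplified pattern."""
    pairs = []
    for match in HEARST_SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        for hyponym in SEPARATOR.split(match.group(2)):
            pairs.append((hypernym, hyponym))
    return pairs


sentence = "works by such authors as Herrick, Goldsmith, and Shakespeare"
for hypernym, hyponym in extract_hyponyms(sentence):
    print(f'hyponym("{hypernym}", "{hyponym}")')
```
<p>Because the sketch skips lemmatization, the hypernym is reported in its surface form ("authors") rather than as the lemma "author" used in the example above. A faithful implementation would match real noun phrases, which requires the syntactic information that gives lexico-syntactic patterns their name.</p></div>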
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lexico-semantic patterns</head><p>Lexico-semantic patterns are employed to remedy the absence, in lexico-syntactic patterns, of concepts that have a specific meaning.</p><p>In addition to the combination of lexical representations and syntactical information used by lexico-syntactic patterns, lexico-semantic patterns also allow the use of semantic information such as concepts defined in ontologies. Lexico-semantic patterns thus combine lexical representations with syntactic and semantic information. They were first presented in <ref type="bibr" target="#b20">[21]</ref>, a work of 1991 describing a text processing system based on lexico-semantic patterns. These patterns can include terms and operators such as lexical features, logical combinations, and repetition, which are mostly adopted from the regular expression language.</p><p>The following example, given by Wooter et al. <ref type="bibr" target="#b6">[7]</ref>, is a lexico-semantic pattern that classifies the verb phrase "left dead" as expressing death or injury: ?PIVOT = (or found left shot) ?OBJ = * ?EFFECT=dead =&gt; (mark-activator murder d-vp) ;</p><p>This pattern would also match "found dead" and "shot dead". Next to standard elements such as repetition and wildcards, the rule presented here contains features like variable assignment on the left-hand side (LHS) (where words preceded by ? denote variables) and, on the right-hand side (RHS), macros such as mark-activator, which uses the results of the pattern match, including variable assignments, along with some other constants, such as murder and d-vp, to tag and segment the text. 
The use of lexico-semantic patterns brings many advantages, the most important being that they take into account the domain semantics, which helps the parser cope with the complexity and flexibility of unstructured text written in natural language <ref type="bibr" target="#b15">[16]</ref>.</p><p>In the current body of literature, many works based on knowledge-driven approaches for event extraction exist. For instance, in their work of 2012, Wooter et al. <ref type="bibr" target="#b18">[19]</ref> proposed a rule-based method to learn ontology instances from text, in which they defined a lexico-semantic pattern language that, in addition to the lexical and syntactical information present in lexico-syntactic rules, also makes use of semantic information.</p><p>In <ref type="bibr" target="#b15">[16]</ref>, the authors proposed the use of lexico-semantic patterns for extracting financial events from RSS news feeds, in order to allow investors on financial markets to monitor financial events when deciding on buying and selling equities. These patterns use financial ontologies, raising the commonly used lexico-syntactic patterns to a higher abstraction level and thus enabling lexico-semantic patterns to recognize events in text more precisely than lexico-syntactic patterns. To that end, the authors developed rules based on lexico-semantic patterns that are used to find events, and semantic actions that update the domain ontology with the effects of the discovered events. Pattern creation was based on the triple paradigm (i.e., it makes use of a subject, a predicate, and an optional object) and relies on triple conversion to the Java Annotations Pattern Engine 1 (JAPE) language <ref type="bibr" target="#b9">[10]</ref> and SPARQL 2 <ref type="bibr" target="#b1">[2]</ref>. 
Another work on economic event extraction is presented by the same authors <ref type="bibr" target="#b17">[18]</ref>, in which they proposed a semantic-based information extraction pipeline for economic event detection that makes use of lexico-semantic patterns defined in the JAPE language. Other works in the same field can be found in <ref type="bibr" target="#b34">[35]</ref>, <ref type="bibr" target="#b35">[36]</ref>.</p><p>Knowledge-driven approaches alleviate many of the problems encountered with data-driven approaches. The first issue they resolve is that there is no need for a huge amount of training data (the text corpora demanded by data-driven approaches) to develop models that approximate linguistic phenomena. The second important advantage is that knowledge-driven approaches make it possible to rely on a combination of lexical, syntactical and semantic elements to define powerful patterns that can be used to extract and recognize very specific information. Nevertheless, one common drawback of knowledge-driven approaches is that defining patterns that retrieve the correct, desired information requires lexical knowledge and possibly also prior domain knowledge, so the help of an expert linguist is needed. Also, relying only on knowledge-driven approaches may cause difficulties and return weak results, especially when a large number of events must be recognized.</p></div>
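<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the "left dead" rule quoted in the lexico-semantic patterns subsection more concrete, the following Python sketch mimics its behaviour on a list of tokens. It is an illustration under our own assumptions, not the pattern language of the cited work: the function name, the returned dictionary and the use of plain word lists instead of ontology concepts are all hypothetical simplifications.</p>
```python
PIVOTS = {"found", "left", "shot"}   # ?PIVOT = (or found left shot)
EFFECT = "dead"                      # ?EFFECT = dead


def match_rule(tokens):
    """Return variable bindings if the rule fires on the tokens, else None."""
    for i, token in enumerate(tokens):
        if token.lower() not in PIVOTS:
            continue
        # ?OBJ = * : any (possibly empty) run of tokens before the effect.
        for j in range(i + 1, len(tokens)):
            if tokens[j].lower() == EFFECT:
                return {
                    "PIVOT": token,
                    "OBJ": tokens[i + 1:j],
                    "EFFECT": tokens[j],
                    # Stands in for the action (mark-activator murder d-vp).
                    "activator": "murder",
                }
    return None


print(match_rule("The attack left three people dead".split()))
print(match_rule("They left early".split()))
```
<p>A genuinely lexico-semantic version would replace the word sets with concepts looked up in a gazetteer or an ontology, so that the same rule also covers words the pattern author did not list explicitly.</p></div>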
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Hybrid approaches for event extraction</head><p>Staying within the limits of one type of event extraction approach may not give the best results. Combining data-driven approaches with knowledge-driven ones can alleviate the drawbacks of each kind, and this creates a new kind of approach: the hybrid approach. In practice, it is difficult to rely on only one kind of event extraction approach. Therefore, the majority of works in the recent literature rely on hybrid approaches. In hybrid approaches, data-driven methods are generally used for the statistical processing (bootstrapping, POS tagging, initial clustering, etc.), while knowledge-driven methods are used for defining powerful expressions, generally by means of lexical, syntactical and semantic elements <ref type="bibr" target="#b28">[29]</ref>. In other words, data-driven approaches are used to deal with huge amounts of data, while knowledge-driven approaches are used where specific meaning matters. Kenji et al. <ref type="bibr" target="#b31">[32]</ref> presented an approach that combines rule-based and data-driven NLP techniques in the extraction of grammatical relations.<note place="foot" n="1">https://gate.ac.uk/sale/tao/splitch8.html</note><note place="foot" n="2">http://www.w3.org/TR/rdf-sparql-query/</note> They have shown that, starting with a rule-based system, unlabeled data and a corpus-based system can be used to improve the recall (and F-score) of grammatical relations. In their work of 2004, Camiano et al. <ref type="bibr" target="#b8">[9]</ref> proposed a hybrid approach that resorts to statistical methods to resolve issues caused by the lack of expert knowledge. Pakhomov et al. <ref type="bibr" target="#b27">[28]</ref> combined statistical methods with lexical knowledge. A similar orientation can be found in <ref type="bibr" target="#b29">[30]</ref>, where the authors used a hybrid approach to reinforce statistical methods. 
The authors of <ref type="bibr" target="#b28">[29]</ref> bootstrap a weakly supervised pattern learning algorithm with clusters in order to extract violence incidents from online news with high precision and recall, and store these in knowledge bases.</p><p>The authors of <ref type="bibr" target="#b22">[23]</ref> apply a grammar-based statistical method to text mining, i.e., POS tagging. However, tagging is based on domain knowledge that is stored in ontologies, thus making the event extraction a hybrid process. Finally, Chun et al. <ref type="bibr" target="#b14">[15]</ref> extract events from biomedical literature by means of lexico-syntactic patterns combined with term co-occurrences.</p><p>The combination of data-driven approaches with knowledge-driven ones brings several enhancements. For instance, even though a large amount of data is still needed to develop statistical models, the amount required by hybrid approaches is smaller than for purely data-driven approaches. Likewise, the number of patterns that experts must develop to detect events is smaller than for purely knowledge-driven approaches, because statistical methods are used to discover events automatically. The drawbacks are generally caused by the complexity of hybrid systems, which encompass many techniques and methods from both data-driven and knowledge-driven approaches.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Discussion</head><p>In this section, we summarize the approaches and methods discussed in a table <ref type="table" target="#tab_0">(Table 1</ref>), in which we set out the main differences between the approaches. To do so, we list the techniques used by each approach (data-driven or knowledge-driven), then the methods used by each approach (hierarchical, graphs, SVM, etc.) and the different types of events. We also present the amount of data required by each approach and, finally, the required domain knowledge and expertise. As shown in Table <ref type="table" target="#tab_0">1</ref>, in terms of data usage, knowledge-driven approaches require smaller amounts of data. Experiments show that only a couple of hundred documents or sentences are needed to generate valuable and accurate event extraction rules. On the other hand, data-driven approaches require more than ten thousand documents to build useful statistical models that give acceptable results. For the hybrid approaches, which combine data-driven and knowledge-driven methods, the amount of required data remains high, but it is much lower than for data-driven approaches, which rely solely on statistical techniques to extract rules. As for interpretability, knowledge-driven approaches give the best results, especially lexico-semantic patterns, which achieve the highest level of interpretability. The data-driven approaches give the lowest interpretability. Based on the results of this survey, and in order to choose the appropriate techniques and methods for event extraction, we recommend knowledge-driven approaches for specific domains, due to the ease, simplicity and high accuracy of rule-based approaches. They also need a smaller amount of data to generate useful models. 
On the other hand, we recommend data-driven and hybrid approaches for users who deal with a large amount and variety of data and need to extract various types of events.</p><p>In this survey, we presented the main approaches in the current literature for event extraction from text. As shown, data-driven approaches (corpus-based approaches) require a large amount of data to discover statistical relations, using quantitative methods such as probabilistic modeling, information theory, and linear algebra to develop models that approximate linguistic phenomena; in return, they require little domain knowledge and expertise. The main advantage of corpus-based methods is that no expert knowledge is needed, but the results have low interpretability. Knowledge-driven approaches rely mainly on patterns developed by experts, although a small amount of data is still needed to develop these patterns. Pattern-based approaches give better results with high interpretability, but they cannot cope with large amounts of data when various types of events must be extracted. Hybrid approaches, which combine knowledge-driven and data-driven methods, appear to be a good way to remedy the drawbacks of each family and to obtain the advantages of both pattern-based and corpus-based methods.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>The eight event types (with subtypes in parentheses) are Life (Be-Born, Marry, Divorce, Injure, Die), Movement (Transport), Transaction (Transfer-Ownership, Transfer-Money), Business (Start-Org, Merge-Org, Declare-Bankruptcy, End-Org), Conflict (Attack, Demonstrate), Contact (Meet, Phone-Write), Personnel (Start-Position, End-Position, Nominate, Elect), Justice (Arrest-Jail, Release-Parole, Trial-Hearing,</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>A comparison between the three event extraction categories in terms of the amount of data required and the knowledge and expertise demanded</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc></figDesc><table /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">ACE (Automatic Content Extraction) English Annotation Guidelines for Events</title>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">SPARQL query language for RDF, W3C recommendation</title>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">C H V</forename><surname>Aaltonen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Heinze</surname></persName>
		</author>
		<title level="m">18th UKAIS Annual Conference: Social Information Systems</title>
				<meeting><address><addrLine>Oxford, UK</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
		<respStmt>
			<orgName>Worcester College</orgName>
		</respStmt>
	</monogr>
	<note>Social media in Europe: Lessons from an online survey</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">A survey of data mining techniques for social media analysis</title>
		<author>
			<persName><forename type="first">M</forename><surname>Adedoyin-Olowe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">M</forename><surname>Gaber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">T</forename><surname>Stahl</surname></persName>
		</author>
		<idno>CoRR, abs/1312.4617</idno>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Extracting city traffic events from social streams</title>
		<author>
			<persName><forename type="first">P</forename><surname>Anantharam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Barnaghi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Thirunarayan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sheth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Intell. Syst. Technol</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">27</biblScope>
			<date type="published" when="2015-07">July 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Social media: Friend or foe of natural language processing?</title>
		<author>
			<persName><forename type="first">T</forename><surname>Baldwin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation</title>
				<meeting>the 26th Pacific Asia Conference on Language, Information, and Computation<address><addrLine>Bali, Indonesia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012-11">November 2012</date>
			<biblScope unit="page" from="58" to="59" />
		</imprint>
		<respStmt>
			<orgName>Faculty of Computer Science, Universitas Indonesia</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Semi-automatic financial events discovery based on lexico-semantic patterns</title>
		<author>
			<persName><forename type="first">J</forename><surname>Borsje</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hogenboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Frasincar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Web Eng. Technol</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="115" to="140" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Exploiting context analysis for combining multiple entity resolution systems</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">V</forename><surname>Kalashnikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mehrotra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2009 ACM SIGMOD International Conference on Management of data</title>
				<meeting>the 2009 ACM SIGMOD International Conference on Management of data</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="207" to="218" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Learning by googling</title>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Staab</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGKDD Explor. Newsl</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="24" to="33" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">JAPE: a Java Annotation Patterns Engine</title>
		<author>
			<persName><forename type="first">H</forename><surname>Cunningham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Maynard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tablan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000-11">November 2000</date>
		</imprint>
		<respStmt>
			<orgName>Department of Computer Science, University of Sheffield</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Research Memorandum CS0010</note>
	<note>Second Edition</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Data-driven part-of-speech tagging of Kiswahili</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">De</forename><surname>Pauw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G.-M</forename><surname>De Schryver</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wagacha</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Text, Speech and Dialogue</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="volume">4188</biblScope>
			<biblScope unit="page" from="197" to="204" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Icelandic data driven part of speech tagging</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dredze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wallenberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Short Papers</title>
		<meeting>the 46th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Columbus, Ohio, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008">June 15-20, 2008</date>
			<biblScope unit="page" from="33" to="36" />
		</imprint>
	</monogr>
	<note>ACL 2008</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Farah</surname></persName>
		</author>
		<title level="m">Extraction de concepts et de relations entre concepts à partir des documents multilingues : Approche statistique et ontologique dissertation</title>
				<meeting><address><addrLine>Lyon, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>Institut National des Sciences Appliquées de Lyon</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A semantic web-based approach for building personalized news services</title>
		<author>
			<persName><forename type="first">B.-J.</forename><forename type="middle">L L</forename><surname>Frasincar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of E-Business Research (IJEBR)</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page">3</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Information extraction: Capabilities and challenges</title>
		<author>
			<persName><forename type="first">R</forename><surname>Grishman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes</title>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Automated discovery of wordnet relations</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Hearst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WordNet: an electronic lexical database</title>
				<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="131" to="153" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">An overview of event extraction from text</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hogenboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Frasincar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Kaymak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">D</forename><surname>Jong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE 2011) at Tenth International Semantic Web Conference (ISWC 2011)</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<imprint>
			<publisher>CEUR-WS.org</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="volume">779</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Speed: A semantics-based pipeline for economic event detection</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hogenboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hogenboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Frasincar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Kaymak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Van Der Meer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Schouten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vandic</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conceptual Modeling - ER 2010</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">J</forename><surname>Parsons</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Saeki</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Shoval</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Woo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Wand</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">6412</biblScope>
			<biblScope unit="page" from="452" to="457" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A lexico-semantic pattern language for learning ontology instances from text</title>
		<author>
			<persName><forename type="first">W</forename><surname>Ijntema</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sangers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hogenboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Frasincar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Web Semantics: Science, Services and Agents on the World Wide Web</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">3</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A lexico-semantic pattern language for learning ontology instances from text</title>
		<author>
			<persName><forename type="first">W</forename><surname>Ijntema</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sangers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hogenboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Frasincar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Web Sem</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="37" to="50" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Lexico-semantic pattern matching as a companion to parsing in text understanding</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Jacobs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">R</forename><surname>Krupka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">F</forename><surname>Rau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Workshop on Speech and Natural Language</title>
				<meeting>the Workshop on Speech and Natural Language<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1991">1991</date>
			<biblScope unit="page" from="337" to="341" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Lexico-syntactic patterns for automatic ontology building</title>
		<author>
			<persName><forename type="first">C</forename><surname>Klaussner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhekova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Student Research Workshop associated with RANLP 2011</title>
				<meeting>the Second Student Research Workshop associated with RANLP 2011<address><addrLine>Hissar, Bulgaria</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2011-09">September 2011</date>
			<biblScope unit="page" from="109" to="114" />
		</imprint>
	</monogr>
	<note>RANLP 2011 Organising Committee</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Ontology-based fuzzy event extraction agent for Chinese e-news summarization</title>
		<author>
			<persName><forename type="first">C.-S</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-W</forename><surname>Jian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Syst. Appl</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="431" to="447" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A system for detecting and tracking internet news event</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Lei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L.-D</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-C</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>Y.-S. Ho and H. J. Kim</editor>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="754" to="764" />
			<date type="published" when="2005">2005</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">Data-Driven syntactic analysis methods and applications for Swedish</title>
		<author>
			<persName><forename type="first">B</forename><surname>Megyesi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2002">2002</date>
		</imprint>
		<respStmt>
			<orgName>Department of Speech, Music and Hearing, KTH, Kungliga Tekniska Högskolan</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Doctoral dissertation</note>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">The benefits of Facebook friends: Social capital and college students' use of online social network sites</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>Nicole</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ellison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lampe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Mediated Communication</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<date type="published" when="2007-07">July 2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Discovering volatile events in your neighborhood: Local-area topic extraction from blog entries</title>
		<author>
			<persName><forename type="first">M</forename><surname>Okamoto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kikuchi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AIRS</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">G</forename><surname>Lee</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Song</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C.-Y</forename><surname>Lin</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Aizawa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Kuriyama</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Yoshioka</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Sakai</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="volume">5839</biblScope>
			<biblScope unit="page" from="181" to="192" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Semi-supervised maximum entropy based approach to acronym and abbreviation normalization in medical texts</title>
		<author>
			<persName><forename type="first">S</forename><surname>Pakhomov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 40th Annual Meeting on Association for Computational Linguistics</title>
				<meeting>the 40th Annual Meeting on Association for Computational Linguistics<address><addrLine>Stroudsburg, PA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="160" to="167" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Extracting violent events from on-line news for ontology population</title>
		<author>
			<persName><forename type="first">J</forename><surname>Piskorski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Tanev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">O</forename><surname>Wennerberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>W. Abramowicz</editor>
		<imprint>
			<biblScope unit="volume">4439</biblScope>
			<biblScope unit="page" from="287" to="300" />
			<date type="published" when="2007">2007</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">The importance of syntactic parsing and inference in semantic role labeling</title>
		<author>
			<persName><forename type="first">V</forename><surname>Punyakanok</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-T</forename><surname>Yih</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Comput. Linguist</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="257" to="287" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Unsupervised techniques for extracting and clustering complex events in news</title>
		<author>
			<persName><forename type="first">D</forename><surname>Rusu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hodson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kimball</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation</title>
				<meeting>the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation<address><addrLine>Baltimore, Maryland, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014-06">June 2014</date>
			<biblScope unit="page" from="26" to="34" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Combining rule-based and data-driven techniques for grammatical relation extraction in spoken language</title>
		<author>
			<persName><forename type="first">K</forename><surname>Sagae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lavie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>MacWhinney</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eighth International Workshop in Parsing</title>
				<meeting>the Eighth International Workshop in Parsing</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="153" to="162" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">A semantic extension for event modelisation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Saval</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bouzid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Brunessaux</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICTAI &apos;09. 21st International Conference on</title>
				<imprint>
			<date type="published" when="2009-11">Nov. 2009</date>
			<biblScope unit="page" from="139" to="146" />
		</imprint>
	</monogr>
	<note>Tools with Artificial Intelligence</note>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<title level="m" type="main">Système informatique de capitalisation de connaissances et d&apos;innovation pour la conception et le pilotage de systèmes de culture durables</title>
		<author>
			<persName><forename type="first">V</forename><surname>Soulignac</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012-10">Oct. 2012</date>
		</imprint>
		<respStmt>
			<orgName>Université Blaise Pascal - Clermont-Ferrand II</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Theses</note>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Engineering Ontologies using Semantic Patterns</title>
		<author>
			<persName><forename type="first">S</forename><surname>Staab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Erdmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Maedche</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2001">2001</date>
			<pubPlace>Seattle</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<title level="m" type="main">Engineering ontologies using semantic patterns</title>
		<author>
			<persName><forename type="first">S</forename><surname>Staab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Erdmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Maedche</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Real-time news event extraction for global crisis monitoring</title>
		<author>
			<persName><forename type="first">H</forename><surname>Tanev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Piskorski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Atkinson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th International Conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems</title>
				<meeting>the 13th International Conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems<address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="207" to="218" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Walker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Strassel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Medero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Maeda</surname></persName>
		</author>
		<title level="m">ACE 2005 Multilingual Training Corpus</title>
		<imprint>
			<publisher>Linguistic Data Consortium</publisher>
			<pubPlace>Philadelphia</pubPlace>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<monogr>
		<title level="m" type="main">Structural methods for lexical/semantic patterns</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Waterman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<title level="m" type="main">Weka: Practical machine learning tools and techniques with Java implementations</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">H</forename><surname>Witten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Frank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Trigg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Holmes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Cunningham</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">A simple Bayesian modelling approach to event extraction from Twitter</title>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 52nd Annual Meeting of the Association for Computational Linguistics<address><addrLine>Baltimore, Maryland</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2014-06">June 2014</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="700" to="705">700-705</biblScope>
		</imprint>
	</monogr>
	<note>(Short Papers)</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
