<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Toward Real Event Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michael Farber?</string-name>
          <email>michael.faerber@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Achim Rettinger</string-name>
          <email>rettinger@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Karlsruhe Institute of Technology (KIT), Institute AIFB</institution>
          ,
          <addr-line>Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>News agencies and other news providers or consumers are confronted with the task of extracting events from news articles. This is done i) to monitor, and hence stay informed about, events of specific kinds over time and/or ii) to react to events immediately. In the past, several promising approaches to extracting events from text have been proposed. Besides purely statistical approaches, there are methods that represent events in a semantically structured form, such as graphs containing actions (predicates), participants (entities), etc. However, it turns out to be very difficult to determine automatically whether an event is real or not. In this paper, we give an overview of approaches that have proposed solutions to this research problem. We show that there is no gold standard dataset in which real events are annotated in text documents in a fine-grained, semantically enriched way. We present a methodology for creating such a dataset with the help of crowdsourcing and report preliminary results.</p>
      </abstract>
      <kwd-group>
        <kwd>Event Detection</kwd>
        <kwd>Information Extraction</kwd>
        <kwd>Factuality</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>This work was carried out with the support of the German Federal Ministry of
Education and Research (BMBF) within the Software Campus project SUITE
(Grant 01IS12051).</p>
      <p>News agencies and other digital media publishers publish tens of thousands
of news articles each day. They also process the news for further
business tasks such as trend prediction and market change detection. This is
still mainly done manually today. Even if knowledge workers at news agencies
have access to all this information, it is infeasible for them to read all the news
and to determine whether the articles contain information which is not only
interesting for people in their domains, but which also describes real events and, hence,
has a significant, immediate impact on business, such as financial operations
(shares) and political happenings. Consider, for example, the first sentence of a
news article:</p>
      <p>(1) "Apple may acquire Beats Electronics next week."</p>
      <p>Here, it remains unclear whether Apple is really going to acquire Beats (and
will not cancel the deal at the last minute) or whether this is just a rumor. The sentence
(2) "Apple confirmed that it acquired Beats Electronics on Wednesday."
in contrast, reveals that the acquisition has already happened (the
confirmation being an event in its own right). This demonstrates the differentiating
characteristic between real events and events in general. As humans, we can
judge that the first sentence is unlikely to trigger immediate shifts in the stock
market (apart from psychological effects), whereas the second one may well do so.
Machines, in contrast, have difficulties in distinguishing real events from
other events.</p>
      <p>We envision building a decision support tool for agents such as stockbrokers.
The aim of the system is to inform the user quickly and automatically when
a detected event has really happened and hence might influence the user's
invested assets. The user should also be able to store only real
events in their database. For such purposes, an event extraction system would
consist of two steps: i) it extracts events in a structured, semantically enriched
representation and ii) it determines, based on linguistic cues, whether the event is
real or not.</p>
      <p>Research on real event detection has been very limited so far. In this paper, we
present an approach to defining events and real events in the setting described above.
Since no suitable gold standard for evaluating a real event detection system
exists, we present our setup for creating one using crowdsourcing. We report
preliminary results regarding this gold standard, as well as the challenges
we came across.</p>
      <p>The remainder of this paper is organized as follows: We first present
definitions of event detection in Section 2, before considering definitions of
real event detection in Section 3. After discussing our setup for creating a gold
standard for real event detection in Section 4, we conclude in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>General Event Definitions</title>
      <sec id="sec-2-1">
        <title>Event Definitions in Use</title>
        <p>We can distinguish between the following classes of event representation (see
also Fig. 1 for examples):</p>
        <p>1. Something happened: In this event representation, events are only roughly
covered. No event types or deeper meanings are captured, only the topic
the document or sentence is about. This topic is often characterized by the
words occurring in the document (bag-of-words model) and/or by the set of
recognized named entities.</p>
        <p>2. This happened: For this representation, the event type of the event is
detected. The event type can be quite generic, such as earthquake. The
number of event types which can be detected is often very limited. Events may
have attributes or slots which are predefined for the individual event types.
Instead of predefined event types such as earthquake or accident, sometimes
only the entity types PER, LOC, ORG, and MISC are used.</p>
        <p>3. This happened to these objects in this way: With this representation
format, we have a deeper understanding of the actual event. Events of
this class are quite specific and include not only specific actions, but also
participants, and possibly the time, place, and manner of the action. Linguistic
theories such as Semantic Role Labeling often provide the basis for event
representations of this class.</p>
        <p>[Fig. 1, panels: (a) Event Representation Class 1; (b) Event Representation
Class 2; (c) Event Representation Class 3.]</p>
        <p>
          Related work using event definitions corresponding to the first event
representation class does not define events at all [
          <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1,2,3,4,5</xref>
          ]. This is because
here it only needs to be known that something happened (something that
is, for instance, different from what has been seen so far), but not what. Events do
not need to be represented on their own; instead, events are indirectly represented
by the document in which they are expressed. Documents are compared against
each other, either by using the bag-of-words model [
          <xref ref-type="bibr" rid="ref2 ref3 ref4">2,3,4</xref>
          ] or by additionally
taking detected named entities (with the classical entity types PER, LOC,
ORG, MISC) into account [
          <xref ref-type="bibr" rid="ref1 ref5">1,5</xref>
          ].
        </p>
        <p>
          Approaches using the second event definition have in common that
coarse-grained events such as accidents or earthquakes are represented. Each
event therefore has an event type. Property-value pairs can be assigned to the
events, where the assignable properties are predefined for each event type.
Templates are often used for storing the information about events [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>
          In the case of event representations of the third kind, structural representations
of fine-grained events are extracted from text, typically from single
sentences or clauses. Research based on this event class usually does not
introduce a new definition of events, but instead either uses linguistic definitions
of events where events consist of happenings with agents, locations, time, etc.
[
          <xref ref-type="bibr" rid="ref7 ref8 ref9">7,8,9</xref>
          ] or abstracts from them to a certain, but limited, extent [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Bejan [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]
characterizes an event as a happening at a given location and in a specific time
interval. Each event has semantic relations to agents, to a location, time, etc.
as parts of the event. These are the semantic/thematic roles of an event in the
linguistic understanding. Events can contain several sub-events. Events of an
event scenario (as a higher-order structure) are connected by event relations. An
example is the cause relation, where one event causes another event. Xie et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]
propose two approaches which are based on semantic frames constructed by
the tool SEMAFOR. Wang et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] also use semantic parsing based
on PropBank in order to represent events. Yeh et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] regard events as similar
to frames in FrameNet. Each event encodes knowledge about the participants,
where (and when) the event occurred, and the events which are caused by this
event. A buy event, for instance, is about the object bought, the donor, and the
recipient.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Event Definition</title>
        <p>In this paper, we focus on the detection and semantically structured
representation of real events of the third event class mentioned above, which is the
most expensive one. More specifically, an event in our scenario is characterized
by</p>
        <p>- specific participants (agents or objects),</p>
        <p>- situations (events or states) which are described within the event,</p>
        <p>- taking place at a specific place and/or time,</p>
        <p>- not being a state.</p>
        <p>States are defined here as situations which last for an indefinite period of time and which
are not really observable. Given example sentence (2) in Section 1, we can
extract two events from it: i) the event that Apple confirmed something (which
is an event itself) and ii) the event that Apple acquired Beats Electronics.</p>
        <p>Fig. 1c shows how these events can be represented as a
semantically structured graph. Event ii) can either be part of
Event i) (as depicted in the figure) or be stored as a separate graph. Nodes in
each event graph can be predicate nodes (representing actions), entity
nodes (representing participants), or literal nodes (representing the time, etc.).
Predicate and entity nodes can be linked to entries in knowledge bases such
as DBpedia (for entities) and WordNet (for predicates). This provides
unique identifiers for resources and helps to resolve ambiguities. The edges in these
event graphs arise from the semantic roles assigned by a Semantic Role Labeling
tool. In the depicted figure, the semantic roles are grounded as RDF predicates.</p>
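        <p>As an illustration of the graph structure described above, the following minimal
Python sketch stores the two events of example sentence (2) as subject-predicate-object
triples. All node and edge names (ev:confirm, role:agent, and so on) are hypothetical
placeholders rather than identifiers from DBpedia, WordNet, or a Semantic Role Labeling
tool, and attaching the time literal to the acquire event is only one plausible reading
of the sentence.</p>
        <preformat>
```python
# Minimal sketch: an event graph as a list of (subject, predicate, object) triples.
# All identifiers below are hypothetical placeholders chosen for illustration.
from collections import defaultdict

TRIPLES = [
    ("ev:confirm", "role:agent", "ent:Apple_Inc"),
    ("ev:confirm", "rel:subevent", "ev:acquire"),
    ("ev:acquire", "role:agent", "ent:Apple_Inc"),
    ("ev:acquire", "role:patient", "ent:Beats_Electronics"),
    ("ev:acquire", "role:time", "Wednesday"),  # assumed attachment of the time literal
]


def build_graph(triples):
    """Index the triples by subject so that each node lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


def participants(graph, event):
    """Return the entity nodes attached to an event via agent or patient edges."""
    roles = ("role:agent", "role:patient")
    return sorted(obj for predicate, obj in graph.get(event, []) if predicate in roles)


graph = build_graph(TRIPLES)
print(participants(graph, "ev:acquire"))
```
        </preformat>
        <p>Storing Event ii) as a separate graph, as mentioned above, would simply mean
dropping the rel:subevent triple and keeping the two groups of triples apart.</p>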
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Real Event Detection</title>
      <sec id="sec-3-1">
        <title>Definitions of Real Events</title>
        <p>We define real event detection as the task of determining whether a given event
expressed in text is real. Real events are events according to the definition in
Section 2.2 which have already happened or are currently happening. Thus, the
definition of events is extended by this aspect. We can therefore split the task of
real event detection into two subtasks: 1. Determining whether the situation described
in the text is an event according to our definition. 2. Determining whether the event
has already happened or is currently happening.</p>
        <p>
          Regarding the first subtask, we can refer to two areas of linguistic work:
i) the distinction of events from states, and ii) the identification of the factuality
of events. In the following, we elaborate on these two areas with respect to our goal
of real event detection. We use the term situation as a generic concept
which encompasses both event and state (cf. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]).
        </p>
        <p>
          Ad i) The classification of situations can be traced back to Aristotle, who
distinguished between verbs that have a defined end or result and others that
do not [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Vendler [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] divided situations into four aspectual classes (also
called aktionsarten) and performed empirical experiments. The aspectual classes
are based on the temporal structure of events. These classes are: state,
activity, accomplishment, and achievement. A state is something in which an
entity remains for a longer, often unspecified period of time (e.g., "Jack knows the
answer"). The three other classes in the aspectual classification cover different
types of events in the narrower sense. An event is characterized as something
which happens or occurs in a definite time interval or at a specific point in time.
It often comes along with predicates such as "write", "push", etc. An event
usually causes some state change.
        </p>
        <p>To determine which aspectual class a given situation belongs to, we can
differentiate between telic, dynamic, and durative situations (see Table 1). Telic
situations always have a culmination point beyond which the situation cannot
continue. Dynamic situations consist of internal sub-events which change over time
and are, hence, intrinsically heterogeneous. For instance, walking consists of several
alternating subevents. Durative situations (e.g., eating) last for a specifiable period
of time and are not punctual.</p>
        <p>
          In our case, we want to distinguish events from states. But how can we
determine which aspectual class holds for a given situation? For Vendler [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]
and others who built on his theories, it became apparent that it is not
trivial to determine the class automatically. See [
          <xref ref-type="bibr" rid="ref11 ref13 ref14 ref15 ref16">13,11,14,15,16</xref>
          ] for more details
on linguistic rules for that purpose.
        </p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Vendler's four-way distinction between verbs based on their aspectual features [
              <xref ref-type="bibr" rid="ref13">13</xref>
              ].</p>
          </caption>
          <table>
            <thead>
              <tr><th>Class</th><th>Telic</th><th>Dynamic</th><th>Durative</th></tr>
            </thead>
            <tbody>
              <tr><td>state</td><td>-</td><td>-</td><td>X</td></tr>
              <tr><td>activity</td><td>-</td><td>X</td><td>X</td></tr>
              <tr><td>accomplishment</td><td>X</td><td>X</td><td>X</td></tr>
              <tr><td>achievement</td><td>X</td><td>X</td><td>-</td></tr>
            </tbody>
          </table>
        </table-wrap>
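        <p>The feature combinations of Table 1 can be read as a simple lookup. The sketch
below is only an illustration: it assumes that achievements are telic and dynamic but
not durative, the function names are hypothetical, and it presupposes that the three
features are already known, which is exactly the part that is hard to determine
automatically.</p>
        <preformat>
```python
# Minimal sketch: look up the aspectual class from the three features of Table 1.
# Assumption: achievements are telic and dynamic, but not durative.
VENDLER_CLASSES = {
    # (telic, dynamic, durative): class
    (False, False, True): "state",
    (False, True, True): "activity",
    (True, True, True): "accomplishment",
    (True, True, False): "achievement",
}


def aspectual_class(telic, dynamic, durative):
    """Return the class for a feature combination, or None if the table has no such row."""
    return VENDLER_CLASSES.get((telic, dynamic, durative))


def is_event(telic, dynamic, durative):
    """Every class except 'state' counts as an event in the narrower sense."""
    cls = aspectual_class(telic, dynamic, durative)
    return cls is not None and cls != "state"


print(aspectual_class(True, True, False), is_event(False, False, True))
```
        </preformat>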
        <p>
          Moens and Steedman [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] propose another classification of situations. Here,
situations are also either states or events. Events are sub-classified along two
dimensions: 1. Events are either atomic or durative. 2. Entities of events
are either in a consequent state or not. We refer to [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] for more information.
        </p>
        <p>
          Ad ii) Other researchers have focused on determining the factuality of events,
i.e., on recognizing whether events are presented in the sentences as corresponding
to real situations in the world, as situations that have not happened, or as
situations of uncertain status. The focus is, hence, the trustworthiness of events in
text. Factuality can be characterized by two dimensions: polarity and epistemic
modality. Polarity (more concretely, polarity regarding actuality rather than subjective
polarity) is a discrete category and can be either positive or negative. Epistemic
modality, in contrast, expresses the speaker's degree of commitment to the
truth of a proposition [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. It ranges from uncertain (also called "possible") to
absolutely certain (also called "necessary"). According to Horn [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], modality
is a continuous category. Saurí [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] spans the space of factuality values with
positive, negative, and unknown for the polarity dimension, and certain, probable,
possible, and unknown for the modality dimension. Unknown holds for cases of
uncommitment. In this way, a tuple of polarity value and epistemic modality
value states the factuality of the event.
        </p>
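        <p>A factuality value of this kind can be sketched as a small data structure. The
value names follow the two dimensions listed above; the class names and the rule in
counts_as_fact are illustrative assumptions and are not taken from the cited work, as
is the mapping of the two example sentences from Section 1 to concrete values.</p>
        <preformat>
```python
# Minimal sketch: factuality as a (polarity, modality) tuple.
# The rule in counts_as_fact is an illustrative assumption, not a definition from the cited work.
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    UNKNOWN = "unknown"


class Modality(Enum):
    CERTAIN = "certain"
    PROBABLE = "probable"
    POSSIBLE = "possible"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Factuality:
    polarity: Polarity
    modality: Modality


def counts_as_fact(value):
    """Assumed rule: only events presented as certainly having happened count as facts."""
    return value.polarity is Polarity.POSITIVE and value.modality is Modality.CERTAIN


# Assumed readings of the two example sentences from Section 1.
rumor = Factuality(Polarity.POSITIVE, Modality.POSSIBLE)     # "may acquire"
confirmed = Factuality(Polarity.POSITIVE, Modality.CERTAIN)  # "confirmed that it acquired"
print(counts_as_fact(rumor), counts_as_fact(confirmed))
```
        </preformat>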
        <p>How is factuality expressed in text? It is expressed by lexical markers as well
as syntactic markers. Lexical modal markers are modal auxiliaries (e.g., "could",
"may", "must") as well as clausal/sentential adverbial modifiers (e.g., "maybe",
"likely", "possibly"). Examples of lexical polarity markers are adverbs (e.g.,
"not", "until"), quantifiers (e.g., "no", "none"), and pronouns (e.g., "nobody").
Syntactic constructs need to be considered since one clause is often embedded
in another. Particularly relevant in this context are relative clauses and
that-clauses, as in the example sentences.</p>
        <p>
          What are the challenges in determining factuality? Factuality markers
interact with each other. The local modality and polarity operators (e.g., of
the current clause) are therefore not enough. Instead, a global consideration is
necessary. For instance, in the case of that-clauses, the factuality of the inner event
depends on the factuality of the outer event. Furthermore, factuality becomes
much more complex because the source of an event is often not
only the author. These additional sources are introduced by means of predicates
of reporting (such as "say" or "tell"), knowledge and opinion (such as "believe",
"know"), psychological reaction (such as "regret"), etc. Saurí and Pustejovsky
[
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] call these predicates, due to their role, Source Introducing Predicates (SIPs).
The difficulty is that the stance of the other sources often differs from that of the author.
The reader does not have direct access to the factual assessment of these other
sources. In the sentence "The Guardian wrote that the G-7 leaders pretended
everything was OK in Russia's economy.", the reader cannot directly assess the
"frame of mind" of The Guardian with respect to the factuality of the event
"pretended". However, the factuality assessment has to be relative to the
relevant sources.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Requirements of a Gold Standard for Real Event Detection</title>
        <p>Based on our event definition in Section 2.2 and the additional aspect of
factuality addressed in Section 3.1, we can list the following requirements which a gold
standard dataset for the evaluation of a real event detection system must fulfill:</p>
        <p>1. Each mention of an action within an event (e.g., "wrote") is annotated.</p>
        <p>2. There is a distinction between events and states, so that all events in the
strict sense are annotated.</p>
        <p>3. There is no restriction to specific event types.</p>
        <p>4. The factuality of the event is annotated (being positive or negative).</p>
        <p>5. All participants and participating objects are annotated.</p>
        <p>6. All participants and participating objects are linked to prevalent knowledge
bases.</p>
        <p>7. Subevents of events are annotated and linked.</p>
        <p>8. Mentions of the place and time of each event are annotated.</p>
        <p>This gold standard is also suitable for extracting real events
according to Event Representation Classes 1 and 2 (see Section 2.1). In
these cases, the information about the structural representation of events can
be neglected. Additional filtering can ensure that only events of specific types,
such as accidents, are detected.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Datasets for Real Event Detection</title>
        <p>In the following, we review existing corpora where event factuality was annotated
to some degree.</p>
        <p>
          The Multi-Perspective Question Answering (MPQA) corpus [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] provides
news articles annotated for opinions and other private states such as beliefs
or thoughts. It was designed for subjectivity and sentiment research and does
not provide any structured representation of (real) events. At most, it might be
applicable as a negative corpus in a scenario where situations described in text are
known not to be real events.
        </p>
        <p>
          The Penn Discourse TreeBank (PDTB) [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] is a corpus where discourse
connectives are annotated along with their arguments (e.g., arg1 "even
though" arg2). On top of the original annotation scheme, an extended
annotation scheme was released for marking the attribution of abstract objects
such as propositions, facts, and eventualities associated with discourse relations
and their arguments annotated in the PDTB. The events described in the
arguments are, however, not transformed into a structured event representation.
        </p>
        <p>
          TimeBank 1.2 [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] is a corpus which was annotated with TimeML [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ].
TimeML is a language for representing temporal and event information.
TimeBank is suitable for event factuality learning since it uses grammar markers
as well as annotations of predicates. Events are classified into occurrence,
state, reporting, immediate-action, immediate-state, aspectual, and perception.
TimeBank does not contain a structured event representation where all
participating objects are annotated. In addition, the event definition is somewhat
different from our proposed definition: A large fraction (25.7%) of the phrases annotated
as events are not verbs, but nouns, adjectives, etc. Not all phrases that should
be regarded as event predicates are annotated.
        </p>
        <p>
          FactBank [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] is a corpus which was built on top of TimeBank and a subset
of the documents in the AQUAINT TimeML Corpus (A-TimeML Corpus). It
comes along with annotations of explicitly factual information about events.
FactBank has the same obstacles as TimeBank.
        </p>
        <p>
          ACE 2005 [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] from the Automatic Content Extraction (ACE) technology
evaluation is a dataset dedicated to the detection of events in text. The task
was limited to the detection of specific event types, namely Life, Movement,
Transaction, Business, Conflict, Contact, Personnel, and Justice. Each type has
one to 13 subtypes, so that each event is assigned to one main event type and
one of its subtypes. The limitation to these event types is the main reason why
ACE 2005 cannot be used directly in our setting. Four attributes are attached to
each annotated event: Modality, Polarity, Genericity, and Tense. Depending
on the event type, specific slots (here called argument roles, such as entities,
values, and times) can be assigned. ACE entities are categorized into specific classes
(namely, Person, Organization, Location, Geo-political entity, Facility, Vehicle,
and Weapon) and their subclasses, but are not linked to any knowledge base.
        </p>
        <p>In summary, none of the mentioned corpora contains
semantically structured representations of events to the extent needed to
evaluate a real event detection system where events are defined as in Section 2.2.
Thus, in the following section we describe experiments on how to build a gold
standard which fulfills all our requirements.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments for Building a Gold Standard Dataset</title>
      <p>Initial crowdsourcing experiments revealed that letting users annotate real
events as described in Section 3.2 all at once is too complex for a crowdsourcing
job. Therefore, we arranged subtasks in which the following questions are answered
separately for each event:</p>
      <p>1. Which are the actions/predicates inducing a real event?</p>
      <p>2. Which are the participating objects?</p>
      <p>3. What is the time and place?</p>
      <p>4. Which sub-events are contained?</p>
      <p>In the following, we present our approach regarding the first subtask, namely
identifying real events and naming their central predicates. We performed
two crowdsourcing jobs which differ in their methodology. The crowdsourcing job
descriptions and evaluation data are available online at
http://www.aifb.kit.edu/web/Toward_Real_Event_Detection.</p>
      <p>Run 1: The crowd was asked to read a given sentence, to look for real events
(as defined above), and to enter the action verbs of these events as written in
the sentence.</p>
      <p>[Fig. 2, Run 1: "Find real actions". 187 sentences, 8 test questions, 12¢ per task,
5 users per judgment. Our gold standard: 205 verbs inducing real events. 224 verbs
judged by the crowd as inducing real events. 152/224 (67.9%) of the verbs judged as
inducing real events are correct.]</p>
      <p>[Fig. 2, Run 2: "Find observable and non-observable predicates". 187 sentences,
9 test questions, 12¢ per task, 5 users per judgment. Our gold standard: 205 action
verbs; 354 observable predicates and 185 non-observable predicates. 205 verbs classified
by the crowd as observable; 334 predicates classified by the crowd as non-observable.
133/205 (64.9%) of the predicates judged as observable are correct. 285/334 (85.3%) of
the predicates judged as non-observable are correct.]</p>
      <p>Run 2: For this second run, the crowd was asked to read each given sentence,
look for all verbs, and categorize them as either observable or non-observable.</p>
      <p>Observable events/facts were defined as follows:<sup>2</sup> An observable fact can
be an occurrence (e.g., "arrive", "destroy"), a reporting (e.g., "report"), or
an immediate action (e.g., "approve"). Observable facts are characterized by
the fact that they could be observed or confirmed by third persons directly
(e.g., in the case of "say") or indirectly (e.g., in the case of "confirm"). Non-observable
facts describe states which characterize persons or objects, but which are not
observable by persons other than those involved. Such non-observable
facts are states which last for an indefinite/unspecified period of time (e.g., "be
happy"), immediate states (e.g., "believe", "worried"), aspects (e.g., "start",
"continue"), or perceptions (e.g., "feel"). The categorization into observable
vs. non-observable facts is made here independently of whether the
event has happened (or the state holds) for sure or not. The categorization into
past/present or future is performed in a separate crowdsourcing task.</p>
      <p>As the dataset, we used all first sentences of news articles which were published on
one day (2014/05/28) by the news agency Bloomberg and which
contained some information about Apple Inc. In total, we manually annotated 187
sentences to assess the performance of our crowdsourcing tasks. Crowdsourcing
was performed on the platform CrowdFlower.<sup>3</sup> In Run 1 (Run 2), users had to
answer 8 (9) quiz test questions before entering the actual task. In both runs,
users received 12 cents per task, each task consisting of 4 questions. For each question, we
collected results from 5 users and kept the answers for which there was an inter-rater
agreement of at least 50%.</p>
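      <p>The aggregation step can be sketched as follows. This is a minimal illustration
under the assumption that agreement is measured per verb as the share of the five users
who entered it; the function name and the sample answers are hypothetical, and the
agreement measure actually computed by the crowdsourcing platform may differ.</p>
      <preformat>
```python
# Minimal sketch: keep every verb that at least half of the workers entered.
# Assumption: agreement is the share of workers who named the verb for this sentence.
from collections import Counter


def aggregate(judgments, min_agreement=0.5):
    """judgments: one collection of verbs per worker, all for the same sentence.

    Returns the set of verbs whose share of supporting workers is at least min_agreement.
    """
    if not judgments:
        return set()
    votes = Counter()
    for worker_answer in judgments:
        votes.update(set(worker_answer))  # count each worker at most once per verb
    n_workers = len(judgments)
    return {verb for verb, count in votes.items() if count / n_workers >= min_agreement}


# Hypothetical answers of five workers for example sentence (2).
answers = [
    {"confirmed", "acquired"},
    {"acquired"},
    {"confirmed", "acquired"},
    {"acquired"},
    {"confirmed"},
]
print(sorted(aggregate(answers)))
```
      </preformat>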
      <p>The results of our crowdsourcing annotation experiments are summarized
in Fig. 2. It became apparent that completing the crowdsourcing tasks requires
high cognitive effort in comparison to other crowdsourcing tasks. A considerable
number of users did not pass the test questions at the beginning. Even if we
admit only users who have previously worked on our jobs sufficiently well, creating a
large annotated corpus is tricky. As Run 2 shows, even the distinction between
observable events, i.e., events showing up in the real world, and non-observable
events is hard to make. Although we put much effort into refining the task
descriptions, the question arises whether a better approach to annotating the
factuality of events is achievable.</p>
      <p><sup>2</sup> The definition is based on the TimeBank annotation guidelines.</p>
      <p><sup>3</sup> http://crowdflower.com</p>
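      <p>The percentages reported in Fig. 2 are the shares of crowd judgments that agree
with our gold standard, which can be read as precision values. The following short
sketch merely recomputes them from the reported counts; the function name and the
dictionary labels are illustrative.</p>
      <preformat>
```python
# Minimal sketch: recompute the percentages reported in Fig. 2 from the raw counts.
def share(correct, judged):
    """Return the percentage of crowd judgments that agree with the gold standard."""
    return round(100 * correct / judged, 1)


REPORTED = {
    "Run 1, verbs inducing real events": (152, 224),
    "Run 2, observable predicates": (133, 205),
    "Run 2, non-observable predicates": (285, 334),
}

for label, (correct, judged) in REPORTED.items():
    print(label, share(correct, judged))
```
      </preformat>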
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>If events are extracted from text in a fine-grained manner, huge numbers of events
are gathered, but only a fraction of them represent real events and, hence, are
worth processing further. In this paper, we gave an overview of existing
linguistic work on the detection of real events. In order to evaluate a
system which extracts semantically structured, real events from text, we defined
requirements and proposed a methodology for creating a gold standard dataset.
Preliminary experiments with crowdsourcing showed that the annotation of text
with factual information is non-trivial. Still, we believe that the creation of such
a dataset is necessary for many event detection systems in the future.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gabrilovich</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumais</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horvitz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Newsjunkie: providing personalized newsfeeds via analysis of information novelty</article-title>
          .
          <source>WWW '04</source>
          , New York, NY, USA, ACM (
          <year>2004</year>
          )
          <volume>482</volume>
          –
          <fpage>490</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Karkali</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rousseau</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ntoulas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vazirgiannis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Efficient Online Novelty Detection in News Streams</article-title>
          . In
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , et al.,
          <source>eds.: Web Information Systems Engineering – WISE 2013</source>
          . Springer Berlin Heidelberg (
          <year>2013</year>
          )
          <fpage>57</fpage>
          –
          <lpage>71</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Callan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Minka</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Novelty and Redundancy Detection in Adaptive Filtering</article-title>
          .
          <source>SIGIR '02</source>
          , New York, NY, USA, ACM (
          <year>2002</year>
          )
          <fpage>81</fpage>
          –
          <lpage>88</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zi</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>L.G.</given-names>
          </string-name>
          :
          <article-title>New Event Detection Based on Indexing-tree and Named Entity</article-title>
          .
          <source>SIGIR '07</source>
          , New York, NY, USA, ACM (
          <year>2007</year>
          )
          <fpage>215</fpage>
          –
          <lpage>222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croft</surname>
            ,
            <given-names>W.B.</given-names>
          </string-name>
          :
          <article-title>Novelty Detection Based on Sentence Level Patterns</article-title>
          .
          <source>CIKM '05</source>
          , New York, NY, USA, ACM (
          <year>2005</year>
          )
          <fpage>744</fpage>
          –
          <lpage>751</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Košmerlj</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Belyaeva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leban</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fortuna</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grobelnik</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Crowdsourcing Event Extraction</article-title>
          .
          In:
          <source>NewsKDD – Workshop on Data Science for News Publishing at KDD 2014</source>
          . (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Xie</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passonneau</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Creamer</surname>
            ,
            <given-names>G.G.</given-names>
          </string-name>
          :
          <article-title>Semantic Frames to Predict Stock Price Movement</article-title>
          .
          In:
          <source>Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          . (
          <year>2013</year>
          )
          <fpage>873</fpage>
          –
          <lpage>883</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Multi-document Summarization via Sentence-level Semantic Analysis and Symmetric Matrix Factorization</article-title>
          .
          <source>SIGIR '08</source>
          , New York, NY, USA, ACM (
          <year>2008</year>
          )
          <fpage>307</fpage>
          –
          <lpage>314</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Yeh</surname>
            ,
            <given-names>P.Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Puri</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kass</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A Knowledge Based Approach for Capturing Rich Semantic Representations from Text for Intelligent Systems</article-title>
          .
          <source>Int. J. Adv. Intell. Paradigms</source>
          <volume>2</volume>
          (
          <issue>1</issue>
          ) (
          <year>November 2010</year>
          )
          <fpage>33</fpage>
          –
          <lpage>48</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Bejan</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          :
          <article-title>Learning event structures from text</article-title>
          .
          <source>PhD thesis</source>
          , The University of Texas at Dallas (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>The Algebra of Events</article-title>
          .
          <source>Linguistics and Philosophy</source>
          (
          <year>1986</year>
          )
          <fpage>5</fpage>
          –
          <lpage>16</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Dowty</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Word Meaning and Montague Grammar: the semantics of verbs and times in generative semantics and in Montague's PTQ</article-title>
          .
          <publisher-name>Reidel</publisher-name>
          (
          <year>1979</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Vendler</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <source>Linguistics in Philosophy</source>
          . Cornell University Press (
          <year>1967</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Moens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steedman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Temporal Ontology and Temporal Reference</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>14</volume>
          (
          <issue>2</issue>
          ) (
          <year>1988</year>
          )
          <fpage>15</fpage>
          –
          <lpage>28</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Pustejovsky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>The syntax of event structure</article-title>
          .
          <source>Cognition</source>
          <volume>41</volume>
          (
          <year>1991</year>
          )
          <fpage>47</fpage>
          –
          <lpage>81</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Dorr</surname>
            ,
            <given-names>B.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olsen</surname>
            ,
            <given-names>M.B.</given-names>
          </string-name>
          :
          <article-title>Deriving Verbal and Compositional Lexical Aspect for NLP Applications</article-title>
          .
          <source>Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL)</source>
          (
          <year>1997</year>
          )
          <fpage>151</fpage>
          –
          <lpage>158</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Palmer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <source>Mood and Modality</source>
          . Cambridge University Press (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Horn</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A Natural History of Negation</article-title>
          . University of Chicago Press (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Saurí</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pustejovsky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>From structure to interpretation: A double-layered annotation for event factuality</article-title>
          .
          <source>Proceedings of the Second Linguistic Annotation Workshop</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cardie</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Annotating expressions of opinions and emotions in language</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>39</volume>
          (
          <issue>2</issue>
          ) (
          <year>2005</year>
          )
          <fpage>165</fpage>
          –
          <lpage>210</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Miltsakaki</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prasad</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joshi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Webber</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>The Penn Discourse Treebank</article-title>
          .
          <source>Proceedings of LREC 2004</source>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Pustejovsky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.:
          <article-title>The TIMEBANK Corpus</article-title>
          .
          <source>Proceedings of Corpus Linguistics 2003</source>
          (
          <year>2003</year>
          )
          <fpage>647</fpage>
          –
          <lpage>656</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Pustejovsky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knippen</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Littman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saurí</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Temporal and event information in natural language text</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>39</volume>
          (
          <issue>2</issue>
          ) (
          <year>2005</year>
          )
          <fpage>123</fpage>
          –
          <lpage>164</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strassel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medero</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maeda</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>ACE 2005 Multilingual Training Corpus LDC2006T06</article-title>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>